Signatures and determinants associated with prostate cancer progression and methods of use thereof

ABSTRACT

The present invention provides a set of DETERMINANTS (e.g., genes and gene products) that can accurately inform about the risk of cancer progression and recurrence, as well as methods of their use.

CROSS REFERENCES TO OTHER APPLICATIONS

This application claims priority from U.S. Provisional Application 61/501,536, filed Jun. 27, 2011 and from U.S. Provisional Application 61/582,787, filed Jan. 3, 2012. The disclosures of those applications are incorporated by reference herein in their entirety.

GOVERNMENT INTEREST

This invention was made with government support under U01-CA84313, U01-CA84313 and R01CA84628 each awarded by the National Cancer Institute and W81XWH-07-PCRP-IDA awarded by the Department of Defense. The United States government has certain rights in the invention.

FIELD OF THE INVENTION

The present invention relates generally to the identification of biological signatures associated with and genetic determinants affecting cancer and methods of using such biological signatures and determinants in the screening, prevention, diagnosis, therapy, monitoring, and prognosis of cancer. The invention further relates to genetically engineered mouse models of metastatic prostate cancer.

BACKGROUND OF THE INVENTION

Prostate cancer (PCA) is the most frequent male cancer and a leading cause of cancer death in the US. Most elderly men harbor prostatic neoplasia, with the vast majority of cases remaining localized and indolent without need for therapeutic intervention.

Current methods of stratifying tumors to predict outcome are based on clinicopathological factors including Gleason grade, PSA, and tumor stage. Although these formulae are helpful, they do not fully predict outcome and importantly are not reliably linked to the most meaningful clinical endpoints of risk of metastatic disease and PCA-specific death. This unmet medical need has fueled efforts to define the genetic and biological bases of PCA progression with the goals of identifying biomarkers capable of assigning progression risk and providing opportunities for targeted interventional therapies. Genetic studies of human PCA have identified a number of signature events including PTEN tumor suppressor inactivation and ETS family translocation and dysregulation, as well as many other important genetic and/or epigenetic alterations including Nkx3.1, c-Myc and SPINK. Global molecular analyses have also identified an array of potential recurrence/metastasis biomarkers, such as ECAD, AIPC, Pim-1 Kinase, hepsin, AMACR, and EZH2. However, the intense heterogeneity of human PCA has limited the utility of single biomarkers in the clinical setting, thus prompting more comprehensive transcriptional profiling studies to define prognostic multi-gene biomarker panels or signatures. Furthermore, the clinical utility of these predictive signatures has remained uncertain due to the inherent noise and context-specific nature of transcriptional networks and the extreme instability of cancer genomes with myriad bystander genetic and epigenetic events producing significant disease heterogeneity. These factors have conspired to impede the identification of biomarkers capable of accurately assigning risk of disease progression. Accordingly, a need exists for more accurate models of human cancer that can be used together with complex human datasets to identify robust biomarkers that can be used to predict the occurrence and the behavior of cancer, particularly at an early stage.

In this invention, we have generated new engineered mouse models of prostate cancer and identified genes and pathways associated with prostate cancer progression. These models, coupled with comparison of human data and functional validation, have led to the discovery of many new therapeutic targets as well as prognostic markers with strong clinical relevance in metastatic cancer.

SUMMARY OF THE INVENTION

The present invention relates in part to the discovery that certain biological markers (referred to herein as “DETERMINANTS”), such as proteins, nucleic acids, polymorphisms, metabolites, and other analytes, as well as certain physiological conditions and states, are present or altered in neoplasms and endow these neoplasms with an increased risk of recurrence and progression to metastatic cancer.

The invention provides a method for predicting prognosis of a cancer patient. In this method, one obtains a tissue sample from the patient, and measures the levels of two or more DETERMINANTS selected from Table 2, 3 or 5 (see below) in the sample, wherein the measured levels are indicative of the prognosis of the cancer patient. In some embodiments, the levels of two, three, four, five, six, seven, eight, nine, ten, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, thirty, forty, fifty, or more DETERMINANTS are measured.

In some embodiments, the prognosis may be that the patient is at a low risk of having metastatic cancer or recurrence of cancer. In other embodiments, the prognosis may be that the patient is at a high risk of having metastatic cancer or recurrence of cancer. In these embodiments, the patient may have melanoma, breast cancer, prostate cancer, or colon cancer.

In some embodiments, an increased risk of cancer recurrence or developing metastatic cancer in the patient is determined by measuring a clinically significant alteration in the levels of the selected DETERMINANTS in the sample. Alternatively, an increased risk of developing metastatic prostate cancer in the patient is determined by comparing the levels of the selected DETERMINANTS to a reference value. In some embodiments, the reference value is an index.
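By way of non-limiting illustration, the comparison of measured levels to a reference value may be sketched as follows. The function names, the use of a log2 fold change, the threshold of 1.0, and all numeric values are assumptions made for this example only; they are not taken from the specification, which does not prescribe how a clinically significant alteration is computed.

```python
import math


def log2_fold_changes(measured, reference):
    """Return log2(measured / reference) for each DETERMINANT found in both inputs.

    Entries with non-positive levels are skipped because the log ratio is undefined.
    """
    return {
        name: math.log2(measured[name] / reference[name])
        for name in measured
        if name in reference and measured[name] > 0 and reference[name] > 0
    }


def altered_determinants(measured, reference, threshold=1.0):
    """List DETERMINANTS whose absolute log2 fold change meets the threshold.

    The threshold is a hypothetical cutoff chosen for illustration.
    """
    changes = log2_fold_changes(measured, reference)
    return sorted(name for name, lfc in changes.items() if abs(lfc) >= threshold)


# Hypothetical levels in arbitrary units, for illustration only.
reference = {"PTK2": 10.0, "SMAD4": 8.0, "DSG2": 5.0}
measured = {"PTK2": 40.0, "SMAD4": 2.0, "DSG2": 5.5}
flagged = altered_determinants(measured, reference)
```

In this sketch a DETERMINANT is flagged whether its level rises or falls relative to the reference; an actual embodiment could equally use a composite index in place of per-DETERMINANT reference values.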

The invention also provides a method for analyzing a tissue sample from a cancer patient. In this method, one obtains the tissue sample from the patient and measures the levels of two or more DETERMINANTS selected from Table 2, 3 or 5 in the sample.

This invention additionally provides a method for identifying a cancer patient in need of adjuvant therapy. In this method, one obtains a tissue sample from the patient, measures the levels of two or more DETERMINANTS selected from Table 2, 3 or 5 in the sample, wherein the measured levels indicate that the patient is in need of adjuvant therapy. For example, the adjuvant therapy may be selected from the group consisting of radiation therapy, chemotherapy, immunotherapy, hormone therapy, and targeted therapy. In some embodiments, the patient has been subjected to a standard of care therapy. In some embodiments, the targeted therapy targets another component of a signaling pathway in which one or more of the selected DETERMINANTS is a component. In alternative embodiments, the targeted therapy targets one or more of the selected DETERMINANTS.

This invention also provides a further method for treating a cancer patient. In this method, one measures the levels of two or more DETERMINANTS selected from Table 2, 3 or 5 in a tissue sample from the patient, and treats the patient with adjuvant therapy if the measured levels indicate that the patient is at a high risk of having metastatic cancer or recurrence of cancer. In some embodiments, the adjuvant therapy is an experimental therapy.

This invention additionally provides a method for monitoring the progression of a tumor in a patient. In this method, one obtains a tumor tissue sample from the patient and measures the levels of two or more DETERMINANTS selected from Table 2, 3 or 5 in the sample, wherein the measured levels are indicative of the progression of the tumor in the patient. In some embodiments, a clinically significant alteration in the measured levels between tumor tissue samples taken from the patient at two different time points is indicative of the progression of the tumor in the patient. In some embodiments, the progression of a tumor in a patient is measured by detecting the levels of the selected DETERMINANTS in a first sample from the patient taken at a first period of time, detecting the levels of the selected DETERMINANTS in a second sample from the patient taken at a second period of time and then comparing the levels of the selected DETERMINANTS to a reference value. In some aspects, the first sample is taken from the patient prior to being treated for the tumor and the second sample is taken from the patient after being treated for the tumor.

The invention also provides a method for monitoring the effectiveness of treatment or selecting a treatment regimen for a recurrent or metastatic cancer in a patient by measuring the levels of two or more DETERMINANTS selected from Table 2, 3, or 5 in a first sample from the patient taken at a first period of time and optionally measuring the level of the selected DETERMINANTS in a second sample from the patient taken at a second period of time. The levels of the selected DETERMINANTS detected at the first period of time are compared to the levels detected at the second period of time or alternatively a reference value. The effectiveness of treatment is monitored by a change in the measured levels of the selected DETERMINANTS from the patient.
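As a non-limiting sketch of the comparison described above, the change in measured levels may be computed against a second sample when one is available and against a reference value otherwise. The function name `level_changes`, the use of a simple difference, and the example values are hypothetical and are offered only to make the two alternatives concrete.

```python
def level_changes(first, second=None, reference=None):
    """Return the per-DETERMINANT change in measured level.

    With a second sample: second minus first (change over time).
    Without one: first minus reference (deviation from the reference value).
    """
    if second is not None:
        baseline, later = first, second
    elif reference is not None:
        baseline, later = reference, first
    else:
        raise ValueError("a second sample or a reference value is required")
    # Only DETERMINANTS measured in both inputs can be compared.
    return {name: later[name] - baseline[name] for name in baseline if name in later}


# Hypothetical levels in arbitrary units, for illustration only.
first_sample = {"MTDH": 6.0, "YWHAZ": 9.0}
second_sample = {"MTDH": 3.5, "YWHAZ": 9.0}
over_time = level_changes(first_sample, second=second_sample)
versus_reference = level_changes(first_sample, reference={"MTDH": 4.0, "YWHAZ": 9.0})
```

Whether a given change reflects effective treatment depends on the DETERMINANT and on the direction of alteration associated with it; the sketch reports the change and leaves that interpretation to the practitioner.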

The invention also provides a kit for measuring the levels of two or more DETERMINANTS selected from Table 2, 3 or 5. The kit comprises reagents for specifically measuring the levels of the selected DETERMINANTS. In some embodiments, the reagents are nucleic acid molecules. In these embodiments, the nucleic acid molecules are PCR primers or hybridizing probes. In alternative embodiments, the reagents are antibodies or fragments thereof, oligonucleotides, or aptamers.

This invention also provides a method for treating a cancer patient in need thereof. In this method, one measures the level of a DETERMINANT selected from Table 2, 3 or 5, and administers an agent that modulates the level of the selected DETERMINANT. In some embodiments, the administered agent may be a small molecule modulator. In some embodiments, the administered agent may be a small molecule inhibitor. In some embodiments, the administered agent may be, for example, siRNA or an antibody or fragment thereof. In some embodiments, the selected DETERMINANT is AGPAT6, ATAD2, ATP6V1C, AZIN1, COX6C, CPNE3, DPYS, EBAG9, EFR3A, EXT1, GRINA, HRSP12, KIAA0196, MAL2, MTDH, NSMCE2, NUDCD1, PDE7A, POLR2K, POP1, PTK2, SPAG1, SQLE, SR1, STK3, TAF2, TGS1, TMEM65, TMEM68, TOP1MT, UBR5, WDYHV1, WWP1, or YWHAZ. In some embodiments, the selected DETERMINANT is AGPAT6, AZIN1, CPNE3, DPYS, NSMCE2, NUDCD1, SR1, TGS1, UBR5, or WDYHV1.

This invention also provides a method of identifying a compound capable of reducing the risk of cancer recurrence or development of metastatic cancer. In this method, one provides a cell expressing a DETERMINANT selected from Table 2, 3 or 5, contacts the cell with a candidate compound, and determines whether the candidate compound alters the expression or activity of the selected DETERMINANT, whereby the alteration observed in the presence of the compound indicates that the compound is capable of reducing the risk of cancer recurrence or development of metastatic cancer.

This invention also provides a method of identifying a compound capable of treating cancer. In this method, one provides a cell expressing a DETERMINANT selected from Table 2, 3 or 5, contacts the cell with a candidate compound, and determines whether the candidate compound alters the expression or activity of the selected biomarker, whereby the alteration observed in the presence of the compound indicates that the compound is capable of treating cancer.

This invention also provides a method of identifying a compound capable of reducing the risk of cancer occurrence or development of cancer. In this method, one provides a cell expressing a DETERMINANT selected from Table 2, 3 or 5, contacts the cell with a candidate compound, and determines whether the candidate compound alters the expression or activity of the selected biomarker, whereby the alteration observed in the presence of the compound indicates that the compound is capable of reducing the risk of cancer occurrence or development of cancer.

According to the invention, a DETERMINANT that can be used in the methods or kits provided by the invention may be selected from the DETERMINANTS listed on Table 2, Table 3, or Table 5. In some embodiments, the selected DETERMINANTS may comprise one or more of MAP3K8, RAD21, and TUSC3. In some embodiments, the selected DETERMINANTS may comprise one or more of ATP5A1, ATP6V1C1, CUL2, CYC1, DCC, ERCC3, MBD2, MTERF, PARD3, PTK2, RBL2, SMAD2, SMAD4, SMAD7, DNAJC15, KIF5B, LECT1, DSG2, ACAA2, ASAP1, LMO7, SVIL, DSC2, PCDH9, WDR7, LAMA3, PCDH8, MKX, MSR1, and POLR2K. In some embodiments, the selected DETERMINANTS may comprise one or more of ATP5A1, ATP6V1C1, CUL2, CYC1, DCC, ERCC3, MBD2, MTERF, PARD3, PTK2, RBL2, SMAD2, SMAD4, and SMAD7. In some embodiments, the selected DETERMINANTS may comprise one or more of DNAJC15, KIF5B, LECT1, DSG2, ACAA2, ASAP1, LMO7, SVIL, DSC2, PCDH9, SMAD7, WDR7, LAMA3, PCDH8, MKX, MSR1, and POLR2K. In some embodiments, the selected DETERMINANTS comprise one or more of DNAJC15, KIF5B, LECT1, DSG2, ACAA2, ASAP1, and LMO7. In some embodiments, the selected DETERMINANTS comprise one or more of SVIL, DSC2, PCDH9, SMAD7, WDR7, LAMA3, PCDH8, MKX, MSR1, and POLR2K. In various embodiments, the methods or the kits provided by the invention further comprise measuring the levels of one or more of PTEN, cyclin D1, SMAD4, and SPP1. In some embodiments, the selected DETERMINANTS comprise two or more of ATP5A1, ATP6V1C1, CUL2, CYC1, DCC, ERCC3, MBD2, MTERF, PARD3, PTK2, RBL2, SMAD2, SMAD4, SMAD7, DNAJC15, KIF5B, LECT1, DSG2, ACAA2, ASAP1, LMO7, SVIL, DSC2, PCDH9, WDR7, LAMA3, PCDH8, MKX, MSR1, POLR2K, PTEN, cyclin D1 and SPP1. See, for example, Table 7 for two-DETERMINANT combinations.

In another aspect of the invention, the selected DETERMINANTS are associated with DNA gain or the selected DETERMINANTS have a clinically significant increase in the measured levels. For example, the DETERMINANTS are selected from 1) the group consisting of DETERMINANTS 1-300 of Table 2; or 2) the group consisting of DETERMINANTS 8, 9, 12, 13, 18-20, 22, 23, 34, 41, 48-50, 56, 64, 65, 70, 72-79, 81, 84, 87, 88, 102, 104, 114, 124, 134, 139, 154, 169-172, 185, 186, 193, 196, 199, 203, 204, 207, 209, 212, 217, 218, 221, 245, 247, 248, 254, 257, 260, 263, 264, 268, 269, 277, 279-281, 283, 284, 286, 287, 292, 294, and 296-298 of Table 3; or 3) the group consisting of DETERMINANTS 9, 12, 18, 19, 22, 23, 41, 56, 70, 72, 75, 76, 77, 87, 102, 114, 134, 139, 171, 172, 185, 196, 199, 207, 209, 217, 221, 247, 248, 260, 263, 264, 268, 269, 279, 296, and 298 of Table 5.

In another aspect of the invention, the selected DETERMINANTS are associated with DNA loss or the selected DETERMINANTS have a clinically significant decrease in the measured levels. For example, the DETERMINANTS are selected from 1) the group consisting of DETERMINANTS 301-741 of Table 2; 2) DETERMINANTS 303, 308, 310, 312, 313, 316, 319, 322, 324, 326, 328, 329, 343, 353, 360, 368, 371, 376, 378, 384, 386, 389, 391, 392, 398, 400, 403, 404, 405, 406, 407, 412, 416, 421, 422, 424, 430, 432, 435, 437, 440, 445, 446, 451, 459, 466, 468, 469, 470, 471, 473, 477, 481, 482, 484, 485, 486, 487, 490, 492, 493, 494, 496, 498, 502, 503, 505, 506, 509, 514, 515, 520, 521, 522, 525, 526, 527, 530, 533, 534, 536, 542, 547, 552, 553, 554, 555, 563, 570, 580, 582, 584, 585, 586, 587, 589, 592, 594, 595, 599, 600, 603, 604, 612, 614, 616, 617, 622, 625, 628, 629, 632, 637, 642, 648, 651, 652, 654, 655, 656, 658, 659, 660, 661, 662, 666, 669, 670, 671, 675, 676, 680, 681, 682, 685, 687, 689, 690, 691, 692, 695, 711, 718, 719, 722, 723, 724, 725, 730, 735, and 740 of Table 3; or 3) DETERMINANTS 308, 312, 319, 322, 324, 326, 328, 329, 343, 371, 378, 386, 389, 391, 392, 400, 416, 422, 424, 440, 445, 466, 471, 481, 482, 484, 490, 492, 493, 494, 496, 498, 503, 505, 506, 514, 521, 522, 525, 526, 527, 533, 542, 554, 555, 570, 582, 584, 585, 586, 592, 594, 595, 612, 617, 628, 629, 637, 642, 648, 651, 658, 659, 660, 661, 675, 680, 681, 685, 687, 692, 718, 719, 723, 730, and 735 of Table 5.

In some embodiments, at least one of the selected DETERMINANTS is associated with REACTOME TGF-beta signaling pathway. In some embodiments, at least one of the selected DETERMINANTS is associated with BIOCARTA TGF-beta signaling pathway. In some embodiments, at least one of the selected DETERMINANTS is associated with KEGG TGF-beta signaling pathway. In some embodiments, at least one of the selected DETERMINANTS is associated with KEGG colorectal cancer pathway. In some embodiments, at least one of the selected DETERMINANTS is associated with KEGG Adherens junction pathway. In some embodiments, at least one of the selected DETERMINANTS is associated with REACTOME RNA Polymerase I/III and mitochondrial transcription pathway. In some embodiments, at least one of the selected DETERMINANTS is associated with KEGG cell cycle pathway. In some embodiments, at least one of the selected DETERMINANTS is associated with KEGG oxidative phosphorylation pathway. In some embodiments, at least one of the selected DETERMINANTS is associated with KEGG pathways in cancer.

In another aspect of the invention, at least one of the selected DETERMINANTS is associated with DNA gain and at least one of the selected DETERMINANTS is associated with DNA loss. In some embodiments, at least one of the selected DETERMINANTS has a clinically significant increase in the measured levels and at least one of the selected DETERMINANTS has a clinically significant decrease in the measured levels.
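For illustration only, a panel of DETERMINANTS identified by their Table 2 numbers may be checked for the presence of at least one gain-associated and at least one loss-associated member. The numbering ranges (1-300 for DNA gain, 301-741 for DNA loss) are those stated above for Table 2; the function names are hypothetical.

```python
def alteration_type(determinant_number):
    """Map a Table 2 DETERMINANT number to its associated DNA alteration."""
    if 1 <= determinant_number <= 300:
        return "gain"
    if 301 <= determinant_number <= 741:
        return "loss"
    raise ValueError(f"not a Table 2 DETERMINANT number: {determinant_number}")


def has_gain_and_loss(panel):
    """Return True if the panel mixes gain- and loss-associated DETERMINANTS."""
    return {alteration_type(n) for n in panel} == {"gain", "loss"}


# DETERMINANTS 9 and 12 appear in the gain list and 308 in the loss list above.
mixed_panel = has_gain_and_loss([9, 308])
gain_only_panel = has_gain_and_loss([9, 12])
```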

The levels of the selected DETERMINANTS may be measured electrophoretically or immunochemically. For example, the levels of the selected DETERMINANTS are detected by radioimmunoassay, immunofluorescence assay or by an enzyme-linked immunosorbent assay. Optionally, the DETERMINANTS are detected using non-invasive imaging technology.

In some embodiments, the levels of the selected DETERMINANTS are determined based on the DNA copy number alteration. In these embodiments, the DNA copy number alteration of the selected DETERMINANT indicates DNA gain or loss. In some embodiments, the RNA transcript levels of the selected DETERMINANTS are measured. In certain embodiments, the RNA transcript levels may be determined by microarray, quantitative RT-PCR, sequencing, nCounter® multiparameter quantitative detection assay (NanoString), branched DNA assay (e.g., Panomics QuantiGene® Plex technology), or quantitative nuclease protection assay (e.g., High Throughput Genomics qNPA™). The nCounter® system was developed by NanoString Technologies. It is based on direct multiplexed measurement of gene expression and is capable of providing high levels of precision and sensitivity (<1 copy per cell) (see 72.5.117.165/applications/technology/). In particular, the nCounter® assay uses molecular “barcodes” and single molecule imaging to detect and count hundreds of unique transcripts in a single reaction. Panomics QuantiGene® Plex technology can also be used to assess the RNA expression of DETERMINANTS in this invention. The QuantiGene® platform is based on the branched DNA technology, a sandwich nucleic acid hybridization assay that provides a unique approach for RNA detection and quantification by amplifying the reporter signal rather than the sequence (Flagella et al., Analytical Biochemistry (2006)). It can reliably quantify RNA expression in fresh, frozen or formalin-fixed, paraffin-embedded (FFPE) tissue homogenates (Knudsen et al., Journal of Molecular Diagnostics (2008)). In some embodiments, the protein levels of the selected DETERMINANTS are measured. In certain embodiments, the protein levels may be measured, for example, by antibodies, immunohistochemistry or immunofluorescence. In these embodiments, the protein levels may be measured in subcellular compartments, for example, by measuring the protein levels of DETERMINANTS in the nucleus relative to the protein levels of the DETERMINANTS in the cytoplasm. In some embodiments, the protein levels of DETERMINANTS may be measured in the nucleus and/or in the cytoplasm.
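As a minimal, non-limiting sketch, the measurement of a protein level in the nucleus relative to the cytoplasm may be expressed as a ratio of the two compartment signals. The function name, the choice of a simple ratio, and the intensity values are assumptions for this example; the specification does not fix a particular formula or unit.

```python
def nuclear_to_cytoplasmic_ratio(nuclear_level, cytoplasmic_level):
    """Return the nuclear protein level relative to the cytoplasmic level.

    Both inputs are compartment signals in the same arbitrary units
    (for example, mean staining intensity from immunofluorescence).
    """
    if nuclear_level < 0:
        raise ValueError("nuclear level must be non-negative")
    if cytoplasmic_level <= 0:
        raise ValueError("cytoplasmic level must be positive")
    return nuclear_level / cytoplasmic_level


# Hypothetical intensities, for illustration only.
ratio = nuclear_to_cytoplasmic_ratio(120.0, 40.0)
```

A ratio above 1 in this sketch indicates a predominantly nuclear signal and a ratio below 1 a predominantly cytoplasmic one.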

In some embodiments, the levels of the DETERMINANTS may be measured separately. Alternatively, the levels of the DETERMINANTS may be measured in a multiplex reaction.

In some embodiments, the noncancerous cells are excluded from the tissue sample. In some embodiments, the tissue sample is a solid tissue sample, a bodily fluid sample, or circulating tumor cells. In some embodiments, the bodily fluid sample may be blood, plasma, urine, saliva, lymph fluid, cerebrospinal fluid (CSF), synovial fluid, cystic fluid, ascites, pleural effusion, interstitial fluid, or ocular fluid. In some embodiments, the solid tissue sample may be a formalin-fixed paraffin-embedded tissue sample, a snap-frozen tissue sample, an ethanol-fixed tissue sample, a tissue sample fixed with an organic solvent, a tissue sample fixed with plastic or epoxy, a cross-linked tissue sample, surgically removed tumor tissue, or a biopsy sample (e.g., a core biopsy, an excisional tissue biopsy, or an incisional tissue biopsy). In some embodiments, the tissue sample is a cancerous tissue sample. In some embodiments, the cancerous tissue is melanoma, prostate cancer, breast cancer, or colon cancer tissue.

In some embodiments, at least one standard parameter associated with the cancer is measured in addition to the measured levels of the selected DETERMINANTS. The at least one standard parameter may be, for example, tumor stage, tumor grade, tumor size, tumor visual characteristics, tumor location, tumor growth, lymph node status, tumor thickness (Breslow score), ulceration, age of onset, PSA level, PSA kinetics, or Gleason score.

In some embodiments, the patient may have a primary tumor, a recurrent tumor, or metastatic prostate cancer.

Also included in the invention is a metastatic prostate cancer reference expression profile containing a pattern of marker levels of an effective amount of two or more markers selected from Tables 2, 3 or 5. Also included is a machine-readable medium containing one or more metastatic tumor reference expression profiles and optionally, additional test results and subject information. In a further aspect the invention provides a DETERMINANT panel containing one or more DETERMINANTS that are indicative of a physiological or biochemical pathway associated with metastasis or the progression of a tumor.

The invention also provides a mouse wherein the genome of at least one prostate epithelial cell contains a homozygous inactivation of the endogenous PTEN gene, p53 gene, and TERT gene, and the TERT gene can be inducibly re-activated and therefore expressed, and wherein the mouse exhibits an increased susceptibility to development of metastatic prostate cancer upon expression of the TERT gene.

The invention also provides a mouse wherein the genome of at least one prostate epithelial cell contains a homozygous inactivation of the endogenous PTEN gene, p53 gene, and SMAD4 gene, and wherein the mouse exhibits an increased susceptibility to development of metastatic prostate cancer. The invention also provides cells from the mouse models and in some aspects, such cells are epithelial cancer cells.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. Although methods and materials similar or equivalent to those described herein can be used in the practice of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are expressly incorporated by reference in their entirety. In cases of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples described herein are illustrative only and are not intended to be limiting.

Other features and advantages of the invention will be apparent from and encompassed by the following detailed description and claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1. Telomere dysfunction inhibited prostate tumor growth, while telomerase reactivation on the background of telomere dysfunction rescued prostate tumor growth. (A) Knock-in strategy for LSL-mTert construct. Cre-mediated recombination by PB-Cre4 can remove the LSL cassette only in the prostate to restore endogenous mTert expression in epithelial cells. (B,C) Telomere dysfunction in later generations (G3, G4) of mTert^(−/−) or LSL-mTert PB-Pten/p53 mice was shown by decreased weight of the testis (B) and an increase of the apoptotic bodies in intestinal crypts (C). (D-F) Quantification of body weight (D), the weight of testis (E), and apoptotic bodies per 100 intestinal crypts (F) of G0 mTert^(+/+) PB-Pten/p53 (denoted as G0 mTert^(+/+), n=20), G4 mTert^(−/−) PB-Pten/p53 (denoted as G4 mTert^(−/−), n=31), and G4 LSL-mTert PB-Pten/p53 (denoted as G4 LSL-mTert, n=20) mice. Error bars represent s.d. *, p<0.05.

FIG. 2. Telomere dysfunction inhibited prostate tumor progression from HPIN to invasive tumor, while telomerase reactivation on the background of telomere dysfunction promotes aggressive spread of G3/4 LSL-mTert PB-Pten/p53 prostate tumors to spinal bones. (A) Gross anatomy of representative prostates at 24 weeks of age. (B) Quantification of the prostate tumor to body weight ratio of G0 mTert^(+/+) (n=20), G4 mTert^(−/−) (n=31), and G4 LSL-Tert (n=20) mice. Error bars represent s.d. (C) H&E sections of the prostate tumors from G0 mTert^(+/+), G4 mTert^(−/−), and G4 LSL-Tert at 24 weeks of age. (D) Quantification of the invasive prostate tumors of G0 mTert^(+/+) (n=20), G4 mTert^(−/−) (n=31), and G4 LSL-Tert (n=20) mice. (E) H&E sections of the prostate tumors of G4 LSL-Tert in spinal bone at 24 weeks of age. (F) Quantification of the mice with prostate tumors spotted in the spinal bones of G0 mTert^(+/+) (n=20), G4 mTert^(−/−) (n=31), and G4 LSL-Tert (n=20) mice.

FIG. 3. Telomerase reactivation maintains telomere length and allows tumor cells to proliferate. (A) Telomere dysfunction induced a strong p53BP1 signal in G4 mTert^(−/−) cells (panel b), but the G4 LSL-Tert cells were significantly rescued (panel c). (B) Quantification of p53BP1 positive prostate tumor. Error bars represent s.d. for a representative experiment performed in triplicate. (C-E) Telomere dysfunction induced apoptosis and blockage of proliferation. Quantification of TUNEL positive (C), Caspase-3 activation positive (D), and Ki67 positive prostate tumor cells (E). Error bars represent s.d. for a representative experiment performed in triplicate.

FIG. 4. Oncogenomic alterations that occur in G3/4 LSL-mTert prostate tumors. (A) Representative SKY images from metaphase spreads from G0 (panel a) and G3 and G4 (panel b) prostate tumors. (B) Quantification of cytogenetic aberrations (recurrences) detected by SKY in G0 mTert+/+ PB-Pten/p53 and G3/4 G4 LSL-mTert PB-Pten/p53. (C) Quantification of cytogenetic aberrations (recurrences) detected by SKY in G0 mTert+/+ PB-Pten/p53 (green) and G3/4 G4 LSL-mTert PB-Pten/p53 (red) prostate tumors. (D) Recurrence plot of CNAs defined by array-CGH for 18 mouse prostate tumors. The x axis shows the physical location of each chromosome. The percentage of prostate tumors harboring gains (bright red, log 2>0.6), losses (green, log 2<−0.3), and deletions (dark green, log 2<−0.6) for each locus is depicted.

FIG. 5. Co-deletion of SMAD4 together with PTEN and TP53 leads to aggressive prostate cancer progression. (A-B) Log₂ ratio of array-CGH plots showing conserved deletion of SMAD4 in both mouse G3, G4 LSL-mTert-PB-Pten/p53 (A) and human prostate sample (B). The y axis shows log₂ of copy number ratio (normal, log₂=0); amplifications are above and deletions are below this axis; x axis is chromosome position, in Mbp. (C-D) Log₂ ratio of array-CGH plots showing co-deletion of PTEN (C) and TP53 (D) in the same human prostate sample that has the SMAD4 deletion. (E) Co-deletion analysis of PTEN, TP53 and SMAD4 in human prostate cancer samples (n=194). The P-value (Fisher's exact test)=2.9e-6 (asterisk). (F) Survival curves showing significant decrease in lifespan in the PB-Pten/p53/Smad4 (n=24) (asterisk) compared with the PB-Pten/p53 cohort (n=25) or PB-Pten/Smad4 (n=44) by Kaplan-Meier overall cumulative survival analysis (P<0.0001). (G) H&E sections of the prostate tumors of PB-Pten/p53/Smad4 in spinal bone at 19 weeks of age.

FIG. 6. Breeding scheme used to produce the experimental cohorts. Five alleles (mTert^(−/−), LSL-mTert^(L/L), Pten^(L/L), p53^(L/L), and PB-Cre4) were used to generate the telomere-intact mTert^(+/+) PB-Pten/p53 mice, the G3/4 telomere-dysfunctional mTert^(−/−) mice, and the G3/4 LSL-Tert mice with telomerase reactivation on the backdrop of telomere dysfunction.

FIG. 7. Histological analyses revealed presence of high-grade prostate intraepithelial neoplasia (HPIN) by age 9 weeks in all three cohorts. (A-C) H&E sections of the HPIN in the anterior prostate (AP) tumors at age of 9 weeks from G0 mTert+/+ PB-Pten/p53 (A), G4 mTert−/− PB-Pten/p53 (B), and G4 LSL-mTert PB-Pten/p53 (C).

FIG. 8. Relative to G0 mTert+/+ PB-Pten/p53 samples, telomere reserves were significantly decreased in G4 mTert−/− PB-Pten/p53 samples and were intermediate in the G4 LSL-mTert PB-Pten/p53 sample. (A) Telomere fluorescence in situ hybridization (FISH) of prostate tumors shows severe telomere erosion in G4 mTert^(−/−) cells (panel b), compared to G0 mTert^(+/+) cells (panel a). Telomeres of G4 LSL-Tert cells were significantly maintained (panel c), compared to G4 mTert^(−/−) cells. (B) Relative telomere length in prostate tumors. Error bars represent s.d. for at least 4 to 6 independent measurements for each genotype.

FIG. 9. Oncogenomic alterations that occur in G3/4 G4 mTert−/− PB-Pten/p53 prostate tumors. (A) H&E sections of the prostate tumors from the invasion escape of G4 mTert−/− PB-Pten/p53 at 24 weeks of age. (B) Representative SKY images from metaphase spreads from mTert−/− PB-Pten/p53 prostate tumors. (C) Quantification of cytogenetic aberrations (recurrences) detected by SKY in mTert−/− PB-Pten/p53 prostate tumors. (D) Quantification of cytogenetic aberrations (recurrences) detected by SKY in G0 mTert+/+ PB-Pten/p53, G3/4 G4 mTert−/− PB-Pten/p53, and G3/4 G4 LSL-mTert PB-Pten/p53 prostate tumors.

FIG. 10. Genomic alterations in both mouse and human prostate tumor cells and derivation of 113 (37 amp and 76 del) genes correlated with bone metastasis. There are a total of 94 MCRs in the aCGH dataset of G3/G4 LSL-Tert prostate tumors (n=18). There are 741 genes (300 amp and 441 del) having the same genomic alteration pattern of amplification or deletion between the mouse prostate tumor dataset and the Taylor et al. (2010) human prostate cancer dataset (n=194). Among these 741 genes, there are a total of 228 genes (77 amp and 151 del) shown to be correlated with prostate cancer progression. Among these 228 genes, there are a total of 113 (37 amp and 76 del) genes shown to be correlated with bone metastasis.

FIG. 11. Aggressive spread of G3/4 LSL-mTert PB-Pten/p53 prostate tumors to spinal bones at 24 weeks of age. (A) H&E sections of the HPIN in the anterior prostate (AP) tumors at age of 9 weeks from G0 mTert+/+ PB-Pten/p53 (denoted as G0 mTert+/+), G4 mTert−/− PB-Pten/p53 (denoted as G4 mTert−/−), and G4 LSL-mTert PB-Pten/p53 (denoted as G4 LSL-mTert). (B) Prostate tumor cells from the primary sites and from spinal bones of G4 LSL-Tert PB-Pten/p53 mouse at 24 weeks of age. (C) Prostate tumor cells from spinal bones of G4 LSL-Tert PB-Pten/p53 mouse were micro-dissected, and the purified genomic DNA was used to detect the genomic status of floxed Pten by PCR (D).

FIG. 12. A high-sensitivity model containing 7 genes (DNAJC15, KIF5B, LECT1, DSG2, ACAA2, ASAP1, LMO7) that aims to minimize false negative rates.

FIG. 13. A high-specificity model containing 10 genes (SVIL, DSC2, PCDH9, SMAD7, WDR7, LAMA3, PCDH8, MKX, MSR1, POLR2K) that aims to minimize false positive rates.

FIG. 14. The new 17 gene set from the combination of the high-sensitivity model of DNAJC15, KIF5B, LECT1, DSG2, ACAA2, ASAP1, LMO7 and the high-specificity model of SVIL, DSC2, PCDH9, SMAD7, WDR7, LAMA3, PCDH8, MKX, MSR1, POLR2K can significantly dichotomize prostate cancer cases into low versus high risk groups for BCR in the Taylor et al. dataset (2010) (Taylor et al., Cancer Cell (2010)).

FIG. 15. The new 17 gene set can significantly enhance the previous 4 gene signature of PTEN/SMAD4/SPP1/CCND1 (Ding et al., Nature (2011)) to dichotomize prostate cancer cases into low versus high risk groups for BCR in the Taylor et al. dataset (Taylor et al., Cancer Cell (2010)). (A) The previous 4-gene signature (SMAD4/PTEN/CCND1/SPP1) can significantly dichotomize prostate cancer cases into low versus high risk groups for BCR in the Taylor et al. dataset (2010). (B) The 17 gene set (DNAJC15, KIF5B, LECT1, DSG2, ACAA2, ASAP1, LMO7, SVIL, DSC2, PCDH9, SMAD7, WDR7, LAMA3, PCDH8, MKX, MSR1, POLR2K) can significantly dichotomize prostate cancer cases into low versus high risk groups for BCR in the Taylor et al. dataset (2010). (C) The combined new 17 gene set can significantly enhance the sensitivity and specificity of SMAD4/PTEN/CCND1/SPP1 to dichotomize prostate cancer cases into low versus high risk groups for BCR in the Taylor et al. dataset (2010).

FIG. 16. The new 17 gene set can significantly enhance the previous 4 gene signature of PTEN/SMAD4/SPP1/CCND1 (Ding et al., Nature (2011)) to dichotomize prostate cancer cases into low versus high risk groups for BCR in the Glinsky et al. dataset (Glinsky et al., J. Clin. Invest. (2004)). (A) The previous 4-gene signature (SMAD4/PTEN/CCND1/SPP1) can significantly dichotomize prostate cancer cases into low versus high risk groups for BCR in the Glinsky et al. dataset (2004). (B) The 17 gene set (DNAJC15, KIF5B, LECT1, DSG2, ACAA2, ASAP1, LMO7, SVIL, DSC2, PCDH9, SMAD7, WDR7, LAMA3, PCDH8, MKX, MSR1, POLR2K) can significantly dichotomize prostate cancer cases into low versus high risk groups for BCR in the Glinsky et al. dataset (2004). (C) The combined new 17 gene set can significantly enhance the sensitivity and specificity of SMAD4/PTEN/CCND1/SPP1 to dichotomize prostate cancer cases into low versus high risk groups for BCR in the Glinsky et al. dataset (2004).

FIG. 17. Co-deletion analysis of PTEN, TP53 and SMAD4 in human prostate cancer sample PCA0186. Log₂ ratio of array-CGH plots showing conserved deletion of PTEN (A), TP53 (B), and SMAD4 (C). The y axis shows log₂ of copy number ratio (normal, log₂=0); amplifications are above and deletions are below this axis; x axis is chromosome position, in Mbp.

FIG. 18. Co-deletion analysis of PTEN, TP53 and SMAD4 in human prostate cancer sample PCA0193. Log₂ ratio of array-CGH plots showing conserved deletion of PTEN (A), TP53 (B), and SMAD4 (C). The y axis shows log₂ of copy number ratio (normal, log₂=0); amplifications are above and deletions are below this axis; x axis is chromosome position, in Mbp.

FIG. 19. Co-deletion analysis of PTEN, TP53 and SMAD4 in human prostate cancer sample PCA0208. Log₂ ratio of array-CGH plots showing conserved deletion of PTEN (A), TP53 (B), and SMAD4 (C). The y axis shows log₂ of copy number ratio (normal, log₂=0); amplifications are above and deletions are below this axis; x axis is chromosome position, in Mbp.

FIG. 20. Co-deletion analysis of PTEN, TP53 and SMAD4 in human prostate cancer sample PCA0060. Log₂ ratio of array-CGH plots showing conserved deletion of PTEN (A), TP53 (B), and SMAD4 (C). The y axis shows log₂ of copy number ratio (normal, log₂=0); amplifications are above and deletions are below this axis; x axis is chromosome position, in Mbp.

FIG. 21. Co-deletion analysis of PTEN, TP53 and SMAD4 in human prostate cancer sample PCA0088. Log₂ ratio of array-CGH plots showing conserved deletion of PTEN (A), TP53 (B), and SMAD4 (C). The y axis shows log₂ of copy number ratio (normal, log₂=0); amplifications are above and deletions are below this axis; x axis is chromosome position, in Mbp.

FIG. 22. Co-deletion analysis of PTEN, TP53 and SMAD4 in human prostate cancer sample PCA0184. Log₂ ratio of array-CGH plots showing conserved deletion of PTEN (A), TP53 (B), and SMAD4 (C). The y axis shows log₂ of copy number ratio (normal, log₂=0); amplifications are above and deletions are below this axis; x axis is chromosome position, in Mbp.

FIG. 23. Co-deletion analysis of PTEN, TP53 and SMAD4 in human prostate cancer sample PCA0185. Log₂ ratio of array-CGH plots showing conserved deletion of PTEN (A), TP53 (B), and SMAD4 (C). The y axis shows log₂ of copy number ratio (normal, log₂=0); amplifications are above and deletions are below this axis; x axis is chromosome position, in Mbp.

FIG. 24. Co-deletion analysis of PTEN, TP53 and SMAD4 in human prostate cancer sample PCA0190. Log₂ ratio of array-CGH plots showing conserved deletion of PTEN (A), TP53 (B), and SMAD4 (C). The y axis shows log₂ of copy number ratio (normal, log₂=0); amplifications are above and deletions are below this axis; x axis is chromosome position, in Mbp.

FIG. 25. Co-deletion analysis of PTEN, TP53 and SMAD4 in human prostate cancer sample PCA0191. Log₂ ratio of array-CGH plots showing conserved deletion of PTEN (A), TP53 (B), and SMAD4 (C). The y axis shows log₂ of copy number ratio (normal, log₂=0); amplifications are above and deletions are below this axis; x axis is chromosome position, in Mbp.

FIG. 26. Co-deletion analysis of PTEN, TP53 and SMAD4 in human prostate cancer sample PCA0192. Log₂ ratio of array-CGH plots showing conserved deletion of PTEN (A), TP53 (B), and SMAD4 (C). The y axis shows log₂ of copy number ratio (normal, log₂=0); amplifications are above and deletions are below this axis; x axis is chromosome position, in Mbp.

FIG. 27. Co-deletion analysis of PTEN, TP53 and SMAD4 in human prostate cancer sample PCA0195. Log₂ ratio of array-CGH plots showing conserved deletion of PTEN (A), TP53 (B), and SMAD4 (C). The y axis shows log₂ of copy number ratio (normal, log₂=0); amplifications are above and deletions are below this axis; x axis is chromosome position, in Mbp.

FIG. 28. Co-deletion analysis of PTEN, TP53 and SMAD4 in human prostate cancer sample PCA0196. Log₂ ratio of array-CGH plots showing conserved deletion of PTEN (A), TP53 (B), and SMAD4 (C). The y axis shows log₂ of copy number ratio (normal, log₂=0); amplifications are above and deletions are below this axis; x axis is chromosome position, in Mbp.

FIG. 29. Co-deletion analysis of PTEN, TP53 and SMAD4 in human prostate cancer sample PCA0199. Log₂ ratio of array-CGH plots showing conserved deletion of PTEN (A), TP53 (B), and SMAD4 (C). The y axis shows log₂ of copy number ratio (normal, log₂=0); amplifications are above and deletions are below this axis; x axis is chromosome position, in Mbp.

FIG. 30. Co-deletion analysis of PTEN, TP53 and SMAD4 in human prostate cancer sample PCA0207. Log₂ ratio of array-CGH plots showing conserved deletion of PTEN (A), TP53 (B), and SMAD4 (C). The y axis shows log₂ of copy number ratio (normal, log₂=0); amplifications are above and deletions are below this axis; x axis is chromosome position, in Mbp.

FIG. 31. Co-deletion analysis of PTEN, TP53 and SMAD4 in human prostate cancer sample PCA0211. Log₂ ratio of array-CGH plots showing conserved deletion of PTEN (A), TP53 (B), and SMAD4 (C). The y axis shows log₂ of copy number ratio (normal, log₂=0); amplifications are above and deletions are below this axis; x axis is chromosome position, in Mbp.

FIG. 32. Pathway enrichment analysis of the 228 gene set and the 513 gene set. P values were adjusted by false discovery rate (FDR).

FIG. 33. Pathway enrichment analysis of the bone metastasis 113 gene set. P values were adjusted by false discovery rate (FDR). Enrichment of the TGF-beta signaling pathway is highlighted by arrows.

FIG. 34. The new 14 gene set can significantly enhance the previous 4 gene signature of PTEN/SMAD4/SPP1/CCND1 (Ding et al., Nature (2011)) to dichotomize prostate cancer cases into low versus high risk groups for BCR in the Taylor et al. dataset (Cancer Cell (2010)). (A) The enriched signaling genes in bone mets form a 14 gene set (ATP5A1/ATP6V1C1/CUL2/CYC1/DCC/ERCC3/MBD2/MTERF/PARD3/PTK2/RBL2/SMAD2/SMAD4/SMAD7) that can significantly dichotomize prostate cancer cases into low versus high risk groups for BCR in the Taylor et al. dataset (2010). (B) The previous 4-gene signature (SMAD4/PTEN/CCND1/SPP1) can significantly dichotomize prostate cancer cases into low versus high risk groups for BCR in the Taylor et al. dataset (2010). (C) The combined gene set can significantly enhance the specificity of SMAD4/PTEN/CCND1/SPP1 to dichotomize prostate cancer cases into low versus high risk groups for BCR in the Taylor et al. dataset (2010).

FIG. 35. The new 14 gene set can significantly enhance the previous 4 gene signature of PTEN/SMAD4/SPP1/CCND1 (Ding et al., Nature (2011)) to dichotomize prostate cancer cases into low versus high risk groups for BCR in the Glinsky et al. dataset (Glinsky et al., J. Clin. Invest. (2004)). (A) The 14 gene set (ATP5A1/ATP6V1C1/CUL2/CYC1/DCC/ERCC3/MBD2/MTERF/PARD3/PTK2/RBL2/SMAD2/SMAD4/SMAD7) can significantly dichotomize prostate cancer cases into low versus high risk groups for BCR in the Glinsky et al. dataset (2004). (B) The previous 4-gene signature (SMAD4/PTEN/CCND1/SPP1) can significantly dichotomize prostate cancer cases into low versus high risk groups for BCR in the Glinsky et al. dataset (2004). (C) The combined gene set can significantly enhance the specificity of SMAD4/PTEN/CCND1/SPP1 to dichotomize prostate cancer cases into low versus high risk groups for BCR in the Glinsky et al. dataset (2004).

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to the identification of signatures associated with, and DETERMINANTS conferring, metastatic prostate cancer or a risk of recurrence of prostate cancer in subjects. The invention further provides a mouse model for prostate cancer that displays human-like telomere dynamics in a prostate cancer prone background. This mouse model allows the role of telomere dysfunction and telomerase reactivation in shaping both the genomics and biology of prostate cancer to be elucidated. The mouse model can be used to identify prostate cancer genes and prognostic biomarkers.

Human cancers harbor innumerable genetic and epigenetic alterations presenting formidable challenges in deciphering those changes that drive the malignant process and dictate a given tumor's clinical behavior. The need for accurately predictive biomarkers reflective of a tumor's malignant potential is evident across many cancer types, particularly prostate cancer, where current management algorithms result in either under-treatment with consequent risk of death or exposure to unnecessary morbid treatments.

Genetically engineered mouse models have been shown (Sharpless and DePinho, Nat. Rev. Drug Discov. (2006)) to be tremendously powerful as “filters” to mine highly complex genomic datasets in humans. In particular, these refined genetically engineered mouse models of human cancers have been documented in high-resolution comparative oncogenomic analyses to harbor substantial overlap in cancer-associated transcriptional and chromosomal DNA aberration patterns, the latter resulting in the rapid and efficient identification of many novel cancer genes. Similar cross-species comparisons of the serum proteome have also proven effective in the identification of early detection biomarkers for pancreatic cancer in humans.

Human prostate cancer genomes are often highly rearranged, displaying numerous chromosomal rearrangements and copy number aberrations of known and potential pathogenetic significance. Telomere dysfunction has been proposed as a mechanism that drives genomic instability in early stage human prostate cancers. In contrast, current mouse models of prostate cancer are notable for minimal cytogenetic aberrations, which may relate to the longer telomeres and more promiscuous expression of telomerase in laboratory mice. Thus, it stands to reason that development of a valid prostate cancer prone mouse model recapitulating human-like telomere dynamics will greatly facilitate efforts to develop prognostic and early detection biomarkers and possible therapeutic targets.

Both conventional knock-out and inducible Lox-Stop-Lox (LSL) knock-in alleles of the mouse telomerase reverse transcriptase (mTert) gene were utilized in mice engineered to sustain prostate-specific Probasin-Cre deletion of the Pten and p53 tumor suppressor genes (PB-Pten/p53). While telomere dysfunction produced by successive mTert null intercrosses constrained prostate cancer progression in the PB-Pten/p53 model, the inducible telomerase (LSL-mTert) allele, which models telomere dysfunction followed by telomerase reactivation, showed highly aggressive prostate cancers with spread to the lumbar spine. On the tumor biological level, telomerase reactivation was associated with a dramatic alleviation of telomere checkpoint responses including DNA damage foci, cell cycle arrest and apoptosis. Spectral karyotype and array-CGH analyses of these models revealed genomic complexity comparable to that of human prostate cancer with many non-reciprocal translocations and recurrent amplifications and deletions. These copy number aberrations targeted regions syntenic to those in human prostate cancer, such as components of the SMAD signaling pathway, suggesting that human and mouse prostate cancers sustain common somatic events in their evolution.

This invention has established a bona fide genetically engineered mouse model of human PCA. This model has not only facilitated the identification of a novel marker set for prostate cancer recurrence in men but also enables mechanistic studies as well as comparative genomic and proteomic analyses in searches for prognostic and early-detection biomarkers.

The data of the invention have demonstrated that the tumor biological impact of SMAD4 inactivation includes increased invasion, increased S phase entry, and decreased senescence.

The data of this invention have also demonstrated that certain SMAD4 direct targets can be useful as biomarkers/DETERMINANTS used in the methods/kits of this invention, including, but not limited to, SPP1 and cyclin D1.

The data of this invention also demonstrate that unbiased checkpoints can identify TGFβ-Smad4 as a major progression barrier in mouse and human prostate cancer; that genetic inactivation of PTEN and Smad4 generates a metastatic prostate cancer model; and that integrative cross-species genomic and functional analysis can identify gene sets of potential clinical utility.

The data of this invention further demonstrate that telomere dysfunction in the Pten/p53 PCAs generates complex chromosomal rearrangements (NRTs) and copy number aberrations; that genomic events in the mouse PCA target regions altered in the human disease; and that genome instability enables acquisition of new biological features including bony metastasis.

Genomic, biological and mouse modeling studies have identified and validated a constellation of genetic and epigenetic events driving disease genesis and progression (Shen et al., Genes Dev. (2010)). Genetic studies of human PCA have identified a number of signature events, principal among which are PTEN and p27^(Kip1) tumor suppressor inactivation (Li et al., Science (1997), Guo et al., Clin. Cancer Res. (1997), Majumder et al., Cancer Cell (2008)), ETS family translocation and dysregulation (Tomlins et al., Science (2005), Rubin, Mod. Pathol. (2008)), as well as many other genetic and/or epigenetic alterations including Nkx3.1, c-Myc, SPINK, and FGFRs (Abate-Shen et al., Differentiation (2008), Tomlins et al., Cancer Cell (2008), Jenkins et al., Cancer Res. (1997), Acevedo et al., Cell Cycle (2009)). Global molecular analyses have also identified an array of potential recurrence/metastasis biomarkers such as ECAD (Rubin et al., Hum. Pathol. (2001)), AIPC (Chaib et al., Cancer Res. (2001)), Pim-1 Kinase (Dhanasekaran et al., Nature (2001)), hepsin (Dhanasekaran et al., Nature (2001)), AMACR (Rubin et al., JAMA (2002)), microRNA mir101 (Varambally et al., Science (2008)), the mir101 target EZH2 (Varambally et al., Nature (2002)), EZH2 target DAB2IP (Min et al., Nat. Med. (2010)), p53 (Chen et al., Nature (2005)), and SMAD4 (Ding et al., Nature (2011)). Recent copy number profile analysis of a large collection of human prostate cancers has revealed numerous recurrent large and focal amplifications and deletions (Taylor et al., Cancer Cell (2010)), pointing to the existence of many uncharacterized cancer-relevant genes of potential prognostic and therapeutic significance. Identification and ultimate biological and clinical validation of these potential cancer genes are hampered by significant intratumoral cellular and biological heterogeneity of human prostate cancers and a paucity of human cell culture model systems, among other challenges. In other solid tumor types, integration of genomic data from genetically engineered mouse models of human cancer has served as a useful filter to facilitate novel cancer gene discovery (Taylor et al., Cancer Cell (2010), Kim et al., Cell (2006), Zender et al., Cell (2006)), particularly in genomically unstable models (Maser et al., Nature (2007)).

Many genome instability mechanisms are thought to contribute to the accumulation of myriad somatic genetic events present in human cancers, particularly epithelial cancers (DePinho, Nature (2007)). Genetic studies in the mouse revealed a cooperative role for telomere dysfunction and deactivated p53 in driving epithelial carcinogenesis via a DNA double-strand breakage process which produces non-reciprocal translocations, amplifications and deletions (Artandi et al., Nature (2000), Chin et al., Cell (1999), O'Hagan et al., Cancer Cell (2002)). Telomere dysfunction also appears to drive human epithelial cancers and their associated genomic instability on the basis of coincidental telomere erosion, anaphase bridging, and chromosomal instability in early stage human carcinomas of the colon (Rudolph et al., Nat. Genet. (2001)), breast (Chin et al., Nat. Genet. (2004)), pancreas (Feldmann et al., J. Hepatobiliary. Pancreat. Surg. (2007)), and prostate (Meeker et al., Cancer Res. (2002)). Recent whole genome sequencing data have provided additional evidence that a period of telomere dysfunction serves to shape the genome rearrangement in human carcinomas (Stratton et al., Nature (2009)). In prostate cancer, telomeres are shorter in cancer cells relative to adjacent normal tissues (Sommerfeld et al., Cancer Res. (1996)); and telomere erosion appears to occur early in the evolution of human prostate cancer (Meeker et al., Cancer Res. (2002), Vukovic et al., Oncogene (2003)).

While telomere dysfunction serves to drive early stages of cancer development, numerous mouse and human studies have shown that subsequent telomerase activation and restoration of telomere function may enable full malignant potential (Stratton et al., Nature (2009), Hahn et al., Nat. Med. (1999)), including metastatic capability (Chang et al., Genes Dev. (2003)). Accordingly, telomerase activity has been documented to be low or undetectable in normal prostate tissues, yet elevated in the majority of prostate tumors (Sommerfeld et al., Cancer Res. (1996), Kallakury et al., Diagn. Mol. Pathol. (1998), Lin et al., J. Urol. (1997), Koeneman et al., J. Urol. (1998), Zhang et al., Cancer Res. (1998)). In this invention, we exploited the experimental merits of the mouse to substantiate the role of telomeres and telomerase in prostate cancer genesis versus progression and in the generation of translocations, amplifications and deletions. In this mouse model, we further assessed whether cancer-relevant loci were targeted by these chromosomal aberrations and the potential for comparative oncogenomics to identify novel prostate cancer genes and prognostic biomarkers.

Accordingly, the invention provides an animal model for prostate cancer. The animal model of the instant invention thus finds particular utility as a screening tool to elucidate the mechanisms of the various genes involved in both normal and diseased patient populations.

The invention also provides methods for identifying subjects who have metastatic prostate cancer, or who are at risk for experiencing a prostate cancer recurrence, by the detection of DETERMINANTS associated with the tumor, including those subjects who are asymptomatic for the tumor. These signatures and DETERMINANTS are also useful for monitoring subjects undergoing treatments and therapies for cancer, and for selecting or modifying therapies and treatments that would be efficacious in subjects having cancer, wherein selection and use of such treatments and therapies slow the progression of the tumor, or substantially delay or prevent its onset, or reduce or prevent the incidence of tumor metastasis and recurrence.

Definitions

“Accuracy” refers to the degree of conformity of a measured or calculated quantity (a test reported value) to its actual (or true) value. Clinical accuracy relates to the proportion of true outcomes (true positives (TP) or true negatives (TN)) versus misclassified outcomes (false positives (FP) or false negatives (FN)), and may be stated as a sensitivity, specificity, positive predictive value (PPV) or negative predictive value (NPV), or as a likelihood or odds ratio, among other measures.

“DETERMINANTS” in the context of the present invention encompasses, without limitation, proteins, nucleic acids, and metabolites, together with their polymorphisms, mutations, variants, modifications, subunits, fragments, protein-ligand complexes, degradation products, elements, related metabolites, and other analytes or sample-derived measures. DETERMINANTS can also include mutated proteins or mutated nucleic acids. DETERMINANTS also encompass non-blood borne factors or non-analyte physiological markers of health status, such as “clinical parameters” defined herein, as well as “traditional laboratory risk factors”, also defined herein. DETERMINANTS also include any calculated indices created mathematically or combinations of any one or more of the foregoing measurements, including temporal trends and differences. Where available, and unless otherwise described herein, DETERMINANTS which are gene products are identified based on the official letter abbreviation or gene symbol assigned by the international Human Genome Organization Naming Committee (HGNC) and listed at the date of this filing at the US National Center for Biotechnology Information (NCBI) web site (http://www.ncbi.nlm.nih.gov/sites/entrez?dbgene), also known as Entrez Gene.

“DETERMINANT” or “DETERMINANTS” encompass one or more of all nucleic acids or polypeptides whose levels are changed in subjects who have prostate cancer or are predisposed to developing metastatic prostate cancer, or at risk of a recurrence of prostate cancer. Individual DETERMINANTS are summarized in Table 2 and are collectively referred to herein as, inter alia, “prostate cancer-associated proteins”, “DETERMINANT polypeptides”, or “DETERMINANT proteins”. The corresponding nucleic acids encoding the polypeptides are referred to as “prostate cancer-associated nucleic acids”, “prostate cancer-associated genes”, “DETERMINANT nucleic acids”, or “DETERMINANT genes”. Unless indicated otherwise, “DETERMINANT”, “prostate cancer-associated proteins”, and “prostate cancer-associated nucleic acids” are meant to refer to any of the sequences disclosed herein. The corresponding metabolites of the DETERMINANT proteins or nucleic acids can also be measured, as well as any of the aforementioned traditional risk marker metabolites.

A DETERMINANT may be implicated in cancer progression or have oncogenic activity. For example, MTDH promotes metastasis and chemoresistance (Hu et al., 2009) and activates AKT (Kikuno et al., 2007); PTK2/FAK is known to drive cell motility and proliferation (Chang et al., 2007) and can promote human prostate cancer cell invasiveness (Johnson et al., 2008); EBAG9 promotes metastasis in the 4T1 breast cancer model (Hong et al., 2009) and is associated with high Gleason score and recurrence (Takahashi et al., 2003); YWHAZ/14-3-3ζ is amplified in H&N SCC and possesses oncogenic activity (Lin et al., 2009); AKAP9 is oncogenic in thyroid papillary carcinoma via fusion to BRAF (Ciampi et al., 2005) and amplified in metastatic melanoma (Kabbarah et al., 2010); MTUS1 has tumor suppressive activity in breast cancer (Rodrigues-Ferreira et al., 2009); DCC is a potential tumor suppressor gene in colon cancer (Mehlen et al., 1998) and a putative metastasis suppressor gene (Rodrigues et al., 2007); APC and Smad2/Smad4 were deleted in up to 20% of human PCA, highlighting the importance of Wnt activation and deactivation of the TGFβ pathway (Ding et al., Nature 2011).

Physiological markers of health status (e.g., age, family history, and other measurements commonly used as traditional risk factors) are referred to as “DETERMINANT physiology”. Calculated indices created from mathematically combining measurements of one or more, preferably two or more, of the aforementioned classes of DETERMINANTS are referred to as “DETERMINANT indices”.

“Clinical parameters” encompasses all non-sample or non-analyte biomarkers of subject health status or other characteristics, such as, without limitation, age (Age), ethnicity (RACE), gender (Sex), or family history (FamHX).

“Circulating endothelial cell” (“CEC”) is an endothelial cell from the inner wall of blood vessels which sheds into the bloodstream under certain circumstances, including inflammation, and contributes to the formation of new vasculature associated with cancer pathogenesis. CECs may be useful as a marker of tumor progression and/or response to antiangiogenic therapy.

“Circulating tumor cell” (“CTC”) is a tumor cell of epithelial origin which is shed from the primary tumor upon metastasis, and enters the circulation. The number of circulating tumor cells in peripheral blood is associated with prognosis in patients with metastatic cancer. These cells can be separated and quantified using immunologic methods that detect epithelial cells, and their expression of PCA progression DETERMINANTS can be quantified by qRT-PCR, immunofluorescence, or other approaches.

“FN” is false negative, which for a disease state test means classifying a disease subject incorrectly as non-disease or normal.

“FP” is false positive, which for a disease state test means classifying a normal subject incorrectly as having disease.

A “formula,” “algorithm,” or “model” is any mathematical equation, algorithmic, analytical or programmed process, or statistical technique that takes one or more continuous or categorical inputs (herein called “parameters”) and calculates an output value, sometimes referred to as an “index” or “index value.” Non-limiting examples of “formulas” include sums, ratios, and regression operators, such as coefficients or exponents, biomarker value transformations and normalizations (including, without limitation, those normalization schemes based on clinical parameters, such as gender, age, or ethnicity), rules and guidelines, statistical classification models, and neural networks trained on historical populations. Of particular use in combining DETERMINANTS and other DETERMINANTS are linear and non-linear equations and statistical classification analyses to determine the relationship between levels of DETERMINANTS detected in a subject sample and the subject's risk of metastatic disease. In panel and combination construction, of particular interest are structural and syntactic statistical classification algorithms, and methods of risk index construction, utilizing pattern recognition features, including established techniques such as cross-correlation, Principal Components Analysis (PCA), factor rotation, Logistic Regression (LogReg), Linear Discriminant Analysis (LDA), Eigengene Linear Discriminant Analysis (ELDA), Support Vector Machines (SVM), Random Forest (RF), Recursive Partitioning Tree (RPART), as well as other related decision tree classification techniques, Shrunken Centroids (SC), StepAIC, Kth-Nearest Neighbor, Boosting, Decision Trees, Neural Networks, Bayesian Networks, and Hidden Markov Models, among others. Other techniques may be used in survival and time to event hazard analysis, including Cox, Weibull, Kaplan-Meier and Greenwood models well known to those of skill in the art. Many of these techniques are useful either combined with a DETERMINANT selection technique, such as forward selection, backwards selection, or stepwise selection, complete enumeration of all potential panels of a given size, or genetic algorithms, or they may themselves include biomarker selection methodologies in their own technique. These may be coupled with information criteria, such as Akaike's Information Criterion (AIC) or Bayes Information Criterion (BIC), in order to quantify the tradeoff between additional biomarkers and model improvement, and to aid in minimizing overfit. The resulting predictive models may be validated in other studies, or cross-validated in the study they were originally trained in, using such techniques as Bootstrap, Leave-One-Out (LOO) and 10-Fold cross-validation (10-Fold CV). At various steps, false discovery rates may be estimated by value permutation according to techniques known in the art.

A “health economic utility function” is a formula that is derived from a combination of the expected probability of a range of clinical outcomes in an idealized applicable patient population, both before and after the introduction of a diagnostic or therapeutic intervention into the standard of care. It encompasses estimates of the accuracy, effectiveness and performance characteristics of such intervention, and a cost and/or value measurement (a utility) associated with each outcome, which may be derived from actual health system costs of care (services, supplies, devices and drugs, etc.) and/or as an estimated acceptable value per quality adjusted life year (QALY) resulting in each outcome. The sum, across all predicted outcomes, of the product of the predicted population size for an outcome multiplied by the respective outcome's expected utility is the total health economic utility of a given standard of care. The difference between (i) the total health economic utility calculated for the standard of care with the intervention versus (ii) the total health economic utility for the standard of care without the intervention results in an overall measure of the health economic cost or value of the intervention. This may itself be divided amongst the entire patient group being analyzed (or solely amongst the intervention group) to arrive at a cost per unit intervention, and to guide such decisions as market positioning, pricing, and assumptions of health system acceptance. Such health economic utility functions are commonly used to compare the cost-effectiveness of the intervention, but may also be transformed to estimate the acceptable value per QALY the health care system is willing to pay, or the acceptable cost-effective clinical performance characteristics required of a new intervention.
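By way of non-limiting illustration, the total health economic utility described above (the sum over outcomes of predicted population size multiplied by the outcome's utility) and the with-versus-without difference may be sketched as follows. The class name `Outcome`, the function names, and all numeric values in the usage note are hypothetical and chosen only for illustration; they are not taken from the data of the invention.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Outcome:
    """One predicted clinical outcome (hypothetical structure for illustration)."""
    name: str          # e.g. "TP", "FP", "TN", "FN"
    population: float  # predicted number of patients with this outcome
    utility: float     # cost (negative) or value (positive) of one such outcome


def total_utility(outcomes: Iterable[Outcome]) -> float:
    """Sum, across all outcomes, of population size times expected utility."""
    return sum(o.population * o.utility for o in outcomes)


def intervention_value(with_intervention: Iterable[Outcome],
                       without_intervention: Iterable[Outcome]) -> float:
    """Total utility with the intervention minus total utility without it."""
    return total_utility(with_intervention) - total_utility(without_intervention)


def value_per_patient(with_intervention: Iterable[Outcome],
                      without_intervention: Iterable[Outcome],
                      n_patients: int) -> float:
    """Overall value divided amongst the patient group being analyzed."""
    if n_patients <= 0:
        raise ValueError("n_patients must be positive")
    return intervention_value(with_intervention, without_intervention) / n_patients
```

For example, with assumed utilities of +10 per TP, −50 per FN, −5 per FP and 0 per TN, a standard of care yielding 80 TP, 20 FN, 50 FP and 850 TN has a total utility of −450, and comparing it against a standard of care yielding 50 TP, 50 FN, 20 FP and 880 TN (total −2100) gives an overall value of 1650 for the intervention.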

For diagnostic (or prognostic) interventions of the invention, as each outcome (which in a disease classifying diagnostic test may be a TP, FP, TN, or FN) bears a different cost, a health economic utility function may preferentially favor sensitivity over specificity, or PPV over NPV, based on the clinical situation and individual outcome costs and value, and thus provides another measure of health economic performance and value which may be different from more direct clinical or analytical performance measures. These different measurements and relative trade-offs generally will converge only in the case of a perfect test, with zero error rate (i.e., zero predicted subject outcome misclassifications, or FP and FN), which all performance measures will favor over imperfection, but to differing degrees.

“Measuring” or “measurement,” or alternatively “detecting” or “detection,” means assessing the presence, absence, quantity or amount (which can be an effective amount) of either a given substance within a clinical or subject-derived sample, including the derivation of qualitative or quantitative concentration levels of such substances, or otherwise evaluating the values or categorization of a subject's non-analyte clinical parameters.

“Negative predictive value” or “NPV” is calculated as TN/(TN+FN), i.e., the true negative fraction of all negative test results. It is also inherently impacted by the prevalence of the disease and the pre-test probability of the population intended to be tested.
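By way of non-limiting illustration, NPV and the related measures of clinical accuracy may be computed from the four outcome counts as sketched below. The function name `classification_metrics` is hypothetical; the formulas for sensitivity, specificity and PPV are the standard textbook definitions and are included here as an assumption, since only the NPV formula is stated above.

```python
import math


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Clinical accuracy measures from counts of TP, FP, TN and FN.

    A ratio whose denominator is zero is reported as NaN rather than
    raising, so an empty category does not abort the calculation.
    """
    def ratio(num: int, den: int) -> float:
        return num / den if den else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),  # diseased subjects correctly called
        "specificity": ratio(tn, tn + fp),  # normal subjects correctly called
        "ppv": ratio(tp, tp + fp),          # true fraction of positive results
        "npv": ratio(tn, tn + fn),          # true fraction of negative results
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }
```

Because PPV and NPV are computed over test results rather than over true disease status, the same test applied to a population with a different disease prevalence will generally give different PPV and NPV even if its sensitivity and specificity are unchanged.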

See, e.g., O'Marcaigh et al., Clin. Ped. (1993), which discusses specificity, sensitivity, and positive and negative predictive values of a test, e.g., a clinical diagnostic test. Often, for binary disease state classification approaches using a continuous diagnostic test measurement, the sensitivity and specificity are summarized by Receiver Operating Characteristics (ROC) curves according to Pepe et al., Am. J. Epidemiol (2004), and summarized by the Area Under the Curve (AUC) or c-statistic, an indicator that allows representation of the sensitivity and specificity of a test, assay, or method over the entire range of test (or assay) cut points with just a single value. See also, e.g., Shultz, “Clinical Interpretation Of Laboratory Procedures,” chapter 14 in Tietz, Fundamentals of Clinical Chemistry, Burtis and Ashwood (eds.), 4th edition 1996, W.B. Saunders Company, pages 192-199; and Zweig et al., Clin. Chem., (1992). An alternative approach using likelihood functions, odds ratios, information theory, predictive values, calibration (including goodness-of-fit), and reclassification measurements is summarized according to Cook, Circulation (2007).
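The AUC or c-statistic mentioned above can be sketched as the probability that a randomly chosen disease subject has a higher test value than a randomly chosen non-disease subject, with ties counted as one half. The function name `auc` and the example scores are hypothetical, and the sketch assumes that higher values indicate disease.

```python
def auc(disease_scores, non_disease_scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5).

    Assumes higher scores indicate disease.
    """
    if not disease_scores or not non_disease_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for d in disease_scores:
        for n in non_disease_scores:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5
    return wins / (len(disease_scores) * len(non_disease_scores))


# Hypothetical continuous test values for two small groups.
example_auc = auc([0.8, 0.4], [0.5, 0.1])
```

A value of 1.0 corresponds to perfect separation of the two groups over all cut points, and 0.5 to a test that does no better than chance.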

Finally, hazard ratios and absolute and relative risk ratios within subject cohorts defined by a test are a further measurement of clinical accuracy and utility. Multiple methods are frequently used to define abnormal or disease values, including reference limits, discrimination limits, and risk thresholds.

“Analytical accuracy” refers to the reproducibility and predictability of the measurement process itself, and may be summarized in such measurements as coefficients of variation, and tests of concordance and calibration of the same samples or controls with different times, users, equipment and/or reagents. These and other considerations in evaluating new biomarkers are also summarized in Vasan, 2006.

“Performance” is a term that relates to the overall usefulness and quality of a diagnostic or prognostic test, including, among others, clinical and analytical accuracy, other analytical and process characteristics, such as use characteristics (e.g., stability, ease of use), health economic value, and relative costs of components of the test. Any of these factors may be the source of superior performance and thus usefulness of the test, and may be measured by appropriate “performance metrics,” such as AUC, time to result, shelf life, etc. as relevant.

“Positive predictive value” or “PPV” is calculated by TP/(TP+FP) or the true positive fraction of all positive test results. It is inherently impacted by the prevalence of the disease and pre-test probability of the population intended to be tested.
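The dependence of predictive values on prevalence can be illustrated with Bayes' rule. The sketch below is illustrative only; the function name `predictive_values` and the example sensitivity, specificity, and prevalence figures are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    for value in (sensitivity, specificity, prevalence):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must lie between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same hypothetical test (90% sensitivity, 90% specificity) applied to
# populations with 10% and 50% prevalence.
ppv_low, npv_low = predictive_values(0.9, 0.9, 0.1)
ppv_high, npv_high = predictive_values(0.9, 0.9, 0.5)
```

With these assumed figures the PPV rises from about 0.5 at 10% prevalence to about 0.9 at 50% prevalence, although the test itself is unchanged.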

“Risk” in the context of the present invention, relates to the probability that an event will occur over a specific time period, as in the conversion to metastatic events, and can mean a subject's “absolute” risk or “relative” risk. Absolute risk can be measured with reference to either actual observation post-measurement for the relevant time cohort, or with reference to index values developed from statistically valid historical cohorts that have been followed for the relevant time period. Relative risk refers to the ratio of absolute risks of a subject compared either to the absolute risks of low risk cohorts or an average population risk, which can vary by how clinical risk factors are assessed. Odds ratios, the proportion of positive events to negative events for a given test result, are also commonly used (odds are according to the formula p/(1−p) where p is the probability of event and (1−p) is the probability of no event) to no-conversion.
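The odds formula p/(1−p) and the ratio of absolute risks given above can be written out as a short sketch. The function names `odds` and `relative_risk` and the example probabilities are hypothetical.

```python
def odds(p):
    """Odds of an event with probability p, i.e. p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def relative_risk(subject_risk, reference_risk):
    """Ratio of a subject's absolute risk to a reference absolute risk."""
    if reference_risk <= 0.0:
        raise ValueError("reference risk must be positive")
    return subject_risk / reference_risk


# Hypothetical example: a 20% probability of conversion corresponds to
# odds of 0.25 (one conversion for every four non-conversions).
example_odds = odds(0.2)

# Hypothetical example: 30% absolute risk versus a 10% low-risk cohort.
example_rr = relative_risk(0.3, 0.1)
```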

“Risk evaluation,” or “evaluation of risk” in the context of the present invention encompasses making a prediction of the probability, odds, or likelihood that an event or disease state may occur, the rate of occurrence of the event or conversion from one disease state to another, i.e., from a primary tumor to metastatic prostate cancer or to one at risk of developing a metastatic, or from at risk of a primary metastatic event to a more secondary metastatic event. Risk evaluation can also comprise prediction of future clinical parameters, traditional laboratory risk factor values, or other indices of cancer, either in absolute or relative terms in reference to a previously measured population. The methods of the present invention may be used to make continuous or categorical measurements of the risk of metastatic prostate cancer thus diagnosing and defining the risk spectrum of a category of subjects defined as being at risk for prostate cancer. In the categorical scenario, the invention can be used to discriminate between normal and other subject cohorts at higher risk for prostate cancers. Such differing use may require different DETERMINANT combinations and individualized panels, mathematical algorithms, and/or cut-off points, but be subject to the same aforementioned measurements of accuracy and performance for the respective intended use.

A “sample” in the context of the present invention is a biological sample isolated from a subject and can include, by way of example and not limitation, tissue biopsies, whole blood, serum, plasma, blood cells, endothelial cells, circulating tumor cells, lymphatic fluid, ascites fluid, interstitial fluid (also known as “extracellular fluid” and encompasses the fluid found in spaces between cells, including, inter alia, gingival crevicular fluid), bone marrow, cerebrospinal fluid (CSF), saliva, mucous, sputum, sweat, urine, or any other secretion, excretion, or other bodily fluids.

A “signature” is an expression pattern of more than one DETERMINANT.

“Sensitivity” is calculated by TP/(TP+FN) or the true positive fraction of disease subjects.

“Specificity” is calculated by TN/(TN+FP) or the true negative fraction of non-disease or normal subjects.
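The four ratios defined in this section (sensitivity, specificity, PPV, and NPV) can be computed together from the counts of TP, FP, TN, and FN. The following is a minimal sketch; the function name `diagnostic_metrics` and the example counts are hypothetical.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, PPV and NPV from outcome counts."""
    if min(tp, fp, tn, fn) < 0:
        raise ValueError("counts must be non-negative")

    def ratio(numerator, denominator):
        # An undefined ratio (empty denominator) is reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # TP/(TP+FN)
        "specificity": ratio(tn, tn + fp),  # TN/(TN+FP)
        "ppv": ratio(tp, tp + fp),          # TP/(TP+FP)
        "npv": ratio(tn, tn + fn),          # TN/(TN+FN)
    }


# Hypothetical counts for 100 disease and 100 non-disease subjects.
metrics = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)
```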

By “statistically significant”, it is meant that the alteration is greater than what might be expected to happen by chance alone (which could be a “false positive”). Statistical significance can be determined by any method known in the art. Commonly used measures of significance include the p-value, which presents the probability of obtaining a result at least as extreme as a given data point, assuming the data point was the result of chance alone. A result is often considered highly significant at a p-value of 0.05 or less.
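One of many possible ways to obtain such a p-value is an exact permutation test, sketched below for a difference in mean level between two small groups. This is an illustrative choice, not a method prescribed by the invention; the function name `permutation_p_value` and the example values are hypothetical.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in group means.

    Enumerates every way of splitting the pooled values into groups of the
    original sizes and counts splits at least as extreme as the observed one.
    """
    if not group_a or not group_b:
        raise ValueError("both groups must be non-empty")
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        splits += 1
    return extreme / splits


# Hypothetical marker levels in two groups of three samples each.
p_value = permutation_p_value([5, 6, 7], [1, 2, 3])
```

With only three samples per group the smallest attainable two-sided p-value is 2/20 = 0.1, which illustrates why sample size limits the significance that can be demonstrated.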

A “subject” in the context of the present invention is preferably a mammal. The mammal can be a human, non-human primate, mouse, rat, dog, cat, horse, or cow, but is not limited to these examples. Mammals other than humans can be advantageously used as subjects that represent animal models of tumor metastasis. A subject can be male or female. A subject can be one who has been previously diagnosed or identified as having a primary tumor or a prostate cancer, and optionally has already undergone, or is undergoing, a therapeutic intervention for the tumor. Alternatively, a subject can also be one who has not been previously diagnosed as having metastatic prostate cancer. For example, a subject can be one who exhibits one or more risk factors for metastatic prostate cancer or prostate cancer recurrence.

“TN” is true negative, which for a disease state test means classifying a non-disease or normal subject correctly.

“TP” is true positive, which for a disease state test means correctly classifying a disease subject.

“Traditional laboratory risk factors” correspond to biomarkers isolated or derived from subject samples and which are currently evaluated in the clinical laboratory and used in traditional global risk assessment algorithms. Traditional laboratory risk factors for tumor metastasis include for example Gleason score, depth of invasion, vessel density, proliferative index, etc. Other traditional laboratory risk factors for tumor metastasis are known to those skilled in the art.

Methods and Uses of the Invention

The invention provides a method for predicting prognosis of a cancer patient. In this method, one obtains a tissue sample from the patient, and measures the levels of two or more DETERMINANTS selected from Table 2, 3 or 5 (see below) in the sample, wherein the measured levels are indicative of the prognosis of the cancer patient. In some embodiments, the levels of two, three, four, five, six, seven, eight, nine, ten, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, thirty, forty, fifty, or more DETERMINANTS are measured.

In some embodiments, the prognosis may be that the patient is at a low risk of having metastatic cancer or recurrence of cancer. In other embodiments, the prognosis may be that the patient is at a high risk of having metastatic cancer or recurrence of cancer. In these embodiments, the patient may have melanoma, breast cancer, prostate cancer, or colon cancer.

In some embodiments, an increased risk of cancer recurrence or developing metastatic cancer in the patient is determined by measuring a clinically significant alteration in the levels of the selected DETERMINANTS in the sample. Alternatively, an increased risk of developing metastatic prostate cancer in the patient is determined by comparing the levels of the selected DETERMINANTS to a reference value. In some embodiments, the reference value is an index.

The invention also provides a method for analyzing a tissue sample from a cancer patient. In this method, one obtains the tissue sample from the patient, and measures the levels of two or more DETERMINANTS selected from Table 2, 3 or 5 in the sample.

This invention additionally provides a method for identifying a cancer patient in need of adjuvant therapy. In this method, one obtains a tissue sample from the patient, and measures the levels of two or more DETERMINANTS selected from Table 2, 3 or 5 in the sample, wherein the measured levels indicate that the patient is in need of adjuvant therapy. For example, the adjuvant therapy may be selected from the group consisting of radiation therapy, chemotherapy, immunotherapy, hormone therapy, and targeted therapy. In some embodiments, the patient has been subjected to a standard of care therapy. In some embodiments, the targeted therapy targets another component of a signaling pathway in which one or more of the selected DETERMINANTS is a component. In alternative embodiments, the targeted therapy targets one or more of the selected DETERMINANTS.

This invention also provides a further method for treating a cancer patient. In this method, one measures the levels of two or more DETERMINANTS selected from Table 2, 3 or 5 in a tissue sample from the patient, and treats the patient with adjuvant therapy if the measured levels indicate that the patient is at a high risk of having metastatic cancer or recurrence of cancer. In some embodiments, the adjuvant therapy is an experimental therapy.

This invention additionally provides a method for monitoring the progression of a tumor in a patient. In this method, one obtains a tumor tissue sample from the patient and measures the levels of two or more DETERMINANTS selected from Table 2, 3 or 5 in the sample, wherein the measured levels are indicative of the progression of the tumor in the patient. In some embodiments, a clinically significant alteration in the measured levels between tumor tissue samples taken from the patient at two different time points is indicative of the progression of the tumor in the patient. In some embodiments, the progression of a tumor in a patient is measured by detecting the levels of the selected DETERMINANTS in a first sample from the patient taken at a first period of time, detecting the levels of the selected DETERMINANTS in a second sample from the patient taken at a second period of time and then comparing the levels of the selected DETERMINANTS to a reference value. In some aspects, the first sample is taken from the patient prior to being treated for the tumor and the second sample is taken from the patient after being treated for the tumor.

The invention also provides a method for monitoring the effectiveness of treatment or selecting a treatment regimen for a recurrent or metastatic cancer in a patient by measuring the levels of two or more DETERMINANTS selected from Table 2, 3, or 5 in a first sample from the patient taken at a first period of time and optionally measuring the level of the selected DETERMINANTS in a second sample from the patient taken at a second period of time. The levels of the selected DETERMINANTS detected at the first period of time are compared to the levels detected at the second period of time or alternatively a reference value. The effectiveness of treatment is monitored by a change in the measured levels of the selected DETERMINANTS from the patient.

The invention also provides a kit for measuring the levels of two or more DETERMINANTS selected from Table 2, 3 or 5. The kit comprises reagents for specifically measuring the levels of the selected DETERMINANTS. In some embodiments, the reagents are nucleic acid molecules. In these embodiments, the nucleic acid molecules are PCR primers or hybridizing probes. In alternative embodiments, the reagents are antibodies or fragments thereof, oligonucleotides, or aptamers.

This invention also provides a method for treating a cancer patient in need thereof. In this method, one measures the level of a DETERMINANT selected from Table 2, 3 or 5, and administers an agent that modulates the level of the selected DETERMINANT. In some embodiments, the administered agent may be a small molecule modulator. In some embodiments, the administered agent may be a small molecule inhibitor. In some embodiments, the administered agent may be, for example, siRNA or an antibody or fragment thereof. In some embodiments, the selected DETERMINANT is AGPAT6, ATAD2, ATP6V1C1, AZIN1, COX6C, CPNE3, DPYS, EBAG9, EFR3A, EXT1, GRINA, HRSPI2, KIAA0196, MAL2, MTDH, NSMCE2, NUDCD1, PDE7A, POLR2K, POP1, PTK2, SPAG1, SQLE, SR1, STK3, TAF2, TGS1, TMEM65, TMEM68, TOP1MT, UBR5, WDYHV1, WWP1, or YWHAZ. In some embodiments, the selected DETERMINANT is AGPAT6, AZIN1, CPNE3, DPYS, NSMCE2, NUDCD1, SR1, TGS1, UBR5, or WDYHV1.

This invention also provides a method of identifying a compound capable of reducing the risk of cancer recurrence or development of metastatic cancer. In this method, one provides a cell expressing a DETERMINANT selected from Table 2, 3 or 5, contacts the cell with a candidate compound, and determines whether the candidate compound alters the expression or activity of the selected DETERMINANT, whereby the alteration observed in the presence of the compound indicates that the compound is capable of reducing the risk of cancer recurrence or development of metastatic cancer.

This invention also provides a method of identifying a compound capable of treating cancer. In this method, one provides a cell expressing a DETERMINANT selected from Table 2, 3 or 5, contacts the cell with a candidate compound, and determines whether the candidate compound alters the expression or activity of the selected biomarker, whereby the alteration observed in the presence of the compound indicates that the compound is capable of treating cancer.

This invention also provides a method of identifying a compound capable of reducing the risk of cancer occurrence or development of cancer. In this method, one provides a cell expressing a DETERMINANT selected from Table 2, 3 or 5, contacts the cell with a candidate compound, and determines whether the candidate compound alters the expression or activity of the selected biomarker, whereby the alteration observed in the presence of the compound indicates that the compound is capable of reducing the risk of cancer occurrence or development of cancer.

According to the invention, a DETERMINANT that can be used in the methods or kits provided by the invention may be selected from the DETERMINANTS listed on Table 2, Table 3, or Table 5. In some embodiments, the selected DETERMINANTS may comprise one or more of MAP3K8, RAD21, and TUSC3. In some embodiments, the selected DETERMINANTS may comprise one or more of ATP5A1, ATP6V1C1, CUL2, CYC1, DCC, ERCC3, MBD2, MTERF, PARD3, PTK2, RBL2, SMAD2, SMAD4, SMAD7, DNAJC15, KIF5B, LECT1, DSG2, ACAA2, ASAP1, LMO7, SVIL, DSC2, PCDH9, WDR7, LAMA3, PCDH8, MKX, MSR1, and POLR2K. In some embodiments, the selected DETERMINANTS may comprise one or more of ATP5A1, ATP6V1C1, CUL2, CYC1, DCC, ERCC3, MBD2, MTERF, PARD3, PTK2, RBL2, SMAD2, SMAD4, and SMAD7. In some embodiments, the selected DETERMINANTS may comprise one or more of DNAJC15, KIF5B, LECT1, DSG2, ACAA2, ASAP1, LMO7, SVIL, DSC2, PCDH9, SMAD7, WDR7, LAMA3, PCDH8, MKX, MSR1, and POLR2K. In some embodiments, the selected DETERMINANTS comprise one or more of DNAJC15, KIF5B, LECT1, DSG2, ACAA2, ASAP1, and LMO7. In some embodiments, the selected DETERMINANTS comprise one or more of SVIL, DSC2, PCDH9, SMAD7, WDR7, LAMA3, PCDH8, MKX, MSR1, and POLR2K. In various embodiments, the methods or the kits provided by the invention further comprise measuring the levels of one or more of PTEN, cyclin D1, SMAD4, and SPP1. In some embodiments, the selected DETERMINANTS comprise two or more of ATP5A1, ATP6V1C1, CUL2, CYC1, DCC, ERCC3, MBD2, MTERF, PARD3, PTK2, RBL2, SMAD2, SMAD4, SMAD7, DNAJC15, KIF5B, LECT1, DSG2, ACAA2, ASAP1, LMO7, SVIL, DSC2, PCDH9, WDR7, LAMA3, PCDH8, MKX, MSR1, POLR2K, PTEN, cyclin D1 and SPP1. See, for example, Table 7 for two-DETERMINANT combinations.

In another aspect of the invention, the selected DETERMINANTS are associated with DNA gain or the selected DETERMINANTS have a clinically significant increase in the measured levels. For example, the DETERMINANTS are selected from 1) the group consisting of DETERMINANTS 1-300 of Table 2; or 2) the group consisting of DETERMINANTS 8, 9, 12, 13, 18-20, 22, 23, 34, 41, 48-50, 56, 64, 65, 70, 72-79, 81, 84, 87, 88, 102, 104, 114, 124, 134, 139, 154, 169-172, 185, 186, 193, 196, 199, 203, 204, 207, 209, 212, 217, 218, 221, 245, 247, 248, 254, 257, 260, 263, 264, 268, 269, 277, 279-281, 283, 284, 286, 287, 292, 294, and 296-298 of Table 3; or 3) the group consisting of DETERMINANTS 9, 12, 18, 19, 22, 23, 41, 56, 70, 72, 75, 76, 77, 87, 102, 114, 134, 139, 171, 172, 185, 196, 199, 207, 209, 217, 221, 247, 248, 260, 263, 264, 268, 269, 279, 296, and 298 of Table 5.

In another aspect of the invention, the selected DETERMINANTS are associated with DNA loss or the selected DETERMINANTS have a clinically significant decrease in the measured levels. For example, the DETERMINANTS are selected from 1) the group consisting of DETERMINANTS 301-741 of Table 2; 2) DETERMINANTS 303, 308, 310, 312, 313, 316, 319, 322, 324, 326, 328, 329, 343, 353, 360, 368, 371, 376, 378, 384, 386, 389, 391, 392, 398, 400, 403, 404, 405, 406, 407, 412, 416, 421, 422, 424, 430, 432, 435, 437, 440, 445, 446, 451, 459, 466, 468, 469, 470, 471, 473, 477, 481, 482, 484, 485, 486, 487, 490, 492, 493, 494, 496, 498, 502, 503, 505, 506, 509, 514, 515, 520, 521, 522, 525, 526, 527, 530, 533, 534, 536, 542, 547, 552, 553, 554, 555, 563, 570, 580, 582, 584, 585, 586, 587, 589, 592, 594, 595, 599, 600, 603, 604, 612, 614, 616, 617, 622, 625, 628, 629, 632, 637, 642, 648, 651, 652, 654, 655, 656, 658, 659, 660, 661, 662, 666, 669, 670, 671, 675, 676, 680, 681, 682, 685, 687, 689, 690, 691, 692, 695, 711, 718, 719, 722, 723, 724, 725, 730, 735, and 740 of Table 3; or 3) DETERMINANTS 308, 312, 319, 322, 324, 326, 328, 329, 343, 371, 378, 386, 389, 391, 392, 400, 416, 422, 424, 440, 445, 466, 471, 481, 482, 484, 490, 492, 493, 494, 496, 498, 503, 505, 506, 514, 521, 522, 525, 526, 527, 533, 542, 554, 555, 570, 582, 584, 585, 586, 592, 594, 595, 612, 617, 628, 629, 637, 642, 648, 651, 658, 659, 660, 661, 675, 680, 681, 685, 687, 692, 718, 719, 723, 730, and 735 of Table 5.

In some embodiments, at least one of the selected DETERMINANTS is associated with REACTOME TGF-beta signaling pathway. In some embodiments, at least one of the selected DETERMINANTS is associated with BIOCARTA TGF-beta signaling pathway. In some embodiments, at least one of the selected DETERMINANTS is associated with KEGG TGF-beta signaling pathway. In some embodiments, at least one of the selected DETERMINANTS is associated with KEGG colorectal cancer pathway. In some embodiments, at least one of the selected DETERMINANTS is associated with KEGG Adherens junction pathway. In some embodiments, at least one of the selected DETERMINANTS is associated with REACTOME RNA Polymerase I/III and mitochondrial transcription pathway. In some embodiments, at least one of the selected DETERMINANTS is associated with KEGG cell cycle pathway. In some embodiments, at least one of the selected DETERMINANTS is associated with KEGG oxidative phosphorylation pathway. In some embodiments, at least one of the selected DETERMINANTS is associated with KEGG pathways in cancer.

In another aspect of the invention, at least one of the selected DETERMINANTS is associated with DNA gain and at least one of the selected DETERMINANTS is associated with DNA loss. In some embodiments, at least one of the selected DETERMINANTS has a clinically significant increase in the measured levels and at least one of the selected DETERMINANTS has a clinically significant decrease in the measured levels.

The levels of the selected DETERMINANTS may be measured electrophoretically or immunochemically. For example, the levels of the selected DETERMINANTS are detected by radioimmunoassay, immunofluorescence assay or by an enzyme-linked immunosorbent assay. Optionally, the DETERMINANTS are detected using non-invasive imaging technology.

In some embodiments, the levels of the selected DETERMINANTS are determined based on the DNA copy number alteration. In these embodiments, the DNA copy number alteration of the selected DETERMINANT indicates DNA gain or loss. In some embodiments, the RNA transcript levels of the selected DETERMINANTS are measured. In certain embodiments, the RNA transcript levels may be determined by microarray, quantitative RT-PCR, sequencing, nCounter® multiparameter quantitative detection assay (NanoString), branched DNA assay (e.g., Panomics QuantiGene® Plex technology), or quantitative nuclease protection assay (e.g., Highthroughput Genomics qNPA™). The nCounter® system was developed by NanoString Technologies. It is based on direct multiplexed measurement of gene expression and is capable of providing high levels of precision and sensitivity (<1 copy per cell) (see 72.5.117.165/applications/technology/). In particular, the nCounter® assay uses molecular “barcodes” and single molecule imaging to detect and count hundreds of unique transcripts in a single reaction. Panomics QuantiGene® Plex technology can also be used to assess the RNA expression of DETERMINANTS in this invention. The QuantiGene® platform is based on the branched DNA technology, a sandwich nucleic acid hybridization assay that provides a unique approach for RNA detection and quantification by amplifying the reporter signal rather than the sequence (Flagella et al., Analytical Biochemistry (2006)). It can reliably and quantitatively measure RNA expression in fresh, frozen or formalin-fixed, paraffin-embedded (FFPE) tissue homogenates (Knudsen et al., Journal of Molecular Diagnostics (2008)). In some embodiments, the protein levels of the selected DETERMINANTS are measured. In certain embodiments, the protein levels may be measured, for example, by antibodies, immunohistochemistry or immunofluorescence. In these embodiments, the protein levels may be measured in subcellular compartments, for example, by measuring the protein levels of DETERMINANTS in the nucleus relative to the protein levels of the DETERMINANTS in the cytoplasm. In some embodiments, the protein levels of DETERMINANTS may be measured in the nucleus and/or in the cytoplasm.
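A nuclear level expressed relative to a cytoplasmic level can be sketched as a simple ratio. The function name `nuclear_to_cytoplasmic_ratio`, the intensity units, and the example values are hypothetical; the text above does not specify how the relative measure is computed, so a plain quotient is assumed here.

```python
def nuclear_to_cytoplasmic_ratio(nuclear_level, cytoplasmic_level):
    """Ratio of a DETERMINANT's nuclear level to its cytoplasmic level.

    Both levels are assumed to be non-negative intensities in the same units.
    """
    if nuclear_level < 0 or cytoplasmic_level < 0:
        raise ValueError("levels must be non-negative")
    if cytoplasmic_level == 0:
        raise ValueError("cytoplasmic level must be greater than zero")
    return nuclear_level / cytoplasmic_level


# Hypothetical immunofluorescence intensities (arbitrary units).
example_ratio = nuclear_to_cytoplasmic_ratio(30.0, 10.0)
```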

In some embodiments, the levels of the DETERMINANTS may be measured separately. Alternatively, the levels of the DETERMINANTS may be measured in a multiplex reaction.

In some embodiments, the noncancerous cells are excluded from the tissue sample. In some embodiments, the tissue sample is a solid tissue sample, a bodily fluid sample, or circulating tumor cells. In some embodiments, the bodily fluid sample may be blood, plasma, urine, saliva, lymph fluid, cerebrospinal fluid (CSF), synovial fluid, cystic fluid, ascites, pleural effusion, interstitial fluid, or ocular fluid. In some embodiments, the solid tissue sample may be a formalin-fixed paraffin embedded tissue sample, a snap-frozen tissue sample, an ethanol-fixed tissue sample, a tissue sample fixed with an organic solvent, a tissue sample fixed with plastic or epoxy, a cross-linked tissue sample, surgically removed tumor tissue, or a biopsy sample (e.g., a core biopsy, an excisional tissue biopsy, or an incisional tissue biopsy). In some embodiments, the tissue sample is a cancerous tissue sample. In some embodiments, the cancerous tissue is melanoma, prostate cancer, breast cancer, or colon cancer tissue.

In some embodiments, at least one standard parameter associated with the cancer is measured in addition to the measured levels of the selected DETERMINANTS. The at least one standard parameter may be, for example, tumor stage, tumor grade, tumor size, tumor visual characteristics, tumor location, tumor growth, lymph node status, tumor thickness (Breslow score), ulceration, age of onset, PSA level, PSA kinetics, or Gleason score.

In some embodiments, the patient may have a primary tumor, a recurrent tumor, or metastatic prostate cancer.

Also included in the invention is a metastatic prostate cancer reference expression profile containing a pattern of marker levels of an effective amount of two or more markers selected from Tables 2, 3 or 5. Also included is a machine readable medium containing one or more metastatic tumor reference expression profiles and optionally, additional test results and subject information. In a further aspect the invention provides a DETERMINANT panel containing one or more DETERMINANTS that are indicative of a physiological or biochemical pathway associated with metastasis or the progression of a tumor.

The invention also provides a mouse wherein the genome of at least one prostate epithelial cell contains a homozygous inactivation of the endogenous PTEN gene, p53 gene, and TERT gene, and the TERT gene can be inducibly re-activated and therefore expressed, and wherein the mouse exhibits an increased susceptibility to development of metastatic prostate cancer upon expression of the TERT gene.

The invention also provides a mouse wherein the genome of at least one prostate epithelial cell contains a homozygous inactivation of the endogenous PTEN gene, p53 gene, and SMAD4 gene, and wherein the mouse exhibits an increased susceptibility to development of metastatic prostate cancer. The invention also provides cells from the mouse models and in some aspects, such cells are epithelial cancer cells.

The methods disclosed herein are used with subjects at risk for developing metastatic prostate cancer, a prostate cancer recurrence or other cancer subjects, such as those with breast cancer who may or may not have already been diagnosed with cancer or other cancer types and subjects undergoing treatment and/or therapies for a primary tumor or metastatic prostate cancer and other cancer types. The methods of the present invention can also be used to monitor or select a treatment regimen for a subject who has a primary tumor or metastatic prostate cancer and other cancer types, and to screen subjects who have not been previously diagnosed as having metastatic prostate cancer and other cancer types, such as subjects who exhibit risk factors for metastasis or reoccurrence. Preferably, the methods of the present invention are used to identify and/or diagnose subjects who are asymptomatic for metastatic tumor prostate cancer and other cancer types. “Asymptomatic” means not exhibiting the traditional signs and symptoms.

The methods of the present invention may also be used to identify and/or diagnose subjects already at higher risk of developing metastatic prostate cancer or prostate cancer recurrence and other metastatic cancer types based solely on the traditional risk factors.

A subject having metastatic prostate cancer and other metastatic cancer types can be identified by measuring the amounts (including the presence or absence) of an effective number (which can be two or more) of DETERMINANTS in a subject-derived sample and the amounts are then compared to a reference value. Alterations in the amounts and patterns of expression of biomarkers, such as proteins, polypeptides, nucleic acids and polynucleotides, polymorphisms of proteins, polypeptides, nucleic acids, and polynucleotides, mutated proteins, polypeptides, nucleic acids, and polynucleotides, or alterations in the molecular quantities of metabolites or other analytes in the subject sample compared to the reference value are then identified.

A reference value can be relative to a number or value derived from population studies, including without limitation, such subjects having the same cancer, subjects having the same or similar age range, subjects in the same or similar ethnic group, subjects having family histories of cancer, or relative to the starting sample of a subject undergoing treatment for a cancer. Such reference values can be derived from statistical analyses and/or risk prediction data of populations obtained from mathematical algorithms and computed indices of cancer metastasis. Reference DETERMINANT indices can also be constructed and used using algorithms and other methods of statistical and structural classification.

In one embodiment of the present invention, the reference value is the amount of DETERMINANTS in a control sample derived from one or more subjects who are not at risk or at low risk for developing prostate cancer. In another embodiment of the present invention, the reference value is the amount of DETERMINANTS in a control sample derived from one or more subjects who are asymptomatic and/or lack traditional risk factors for metastatic prostate cancer. In a further embodiment, such subjects are monitored and/or periodically retested for a diagnostically relevant period of time (“longitudinal studies”) following such test to verify continued absence of metastatic prostate cancer (disease or event free survival). Such period of time may be one year, two years, two to five years, five years, five to ten years, ten years, or ten or more years from the initial testing date for determination of the reference value. Furthermore, retrospective measurement of DETERMINANTS in properly banked historical subject samples may be used in establishing these reference values, thus shortening the study time required.

A reference value can also comprise the amounts of DETERMINANTS derived from subjects who show an improvement in metastatic or recurrence risk factors as a result of treatments and/or therapies for the cancer. A reference value can also comprise the amounts of DETERMINANTS derived from subjects who have confirmed disease by known invasive or non-invasive techniques, or are at high risk for developing prostate cancer, or who have suffered from metastatic or recurrent prostate cancer.

In another embodiment, the reference value is an index value or a baseline value. An index value or baseline value is a composite sample of an effective amount of DETERMINANTS from one or more subjects who do not have prostate cancer, or subjects who are asymptomatic for metastatic cancer. A baseline value can also comprise the amounts of DETERMINANTS in a sample derived from a subject who has shown an improvement in prostate cancer risk factors as a result of cancer treatments or therapies. In this embodiment, to make comparisons to the subject-derived sample, the amounts of DETERMINANTS are similarly calculated and compared to the index value. Optionally, subjects identified as having prostate cancer, or as being at increased risk of developing metastatic prostate cancer or prostate cancer recurrence, are chosen to receive a therapeutic regimen to slow the progression of the cancer, or decrease or prevent the risk of developing metastatic or recurrent prostate cancer.

The progression of metastatic prostate cancer, or effectiveness of a cancer treatment regimen can be monitored by detecting a DETERMINANT in an effective amount (which may be two or more) of samples obtained from a subject over time and comparing the amount of DETERMINANTS detected. For example, a first sample can be obtained prior to the subject receiving treatment and one or more subsequent samples are taken after or during treatment of the subject. The cancer is considered to be progressive (or, alternatively, the treatment does not prevent progression) if the amount of DETERMINANT changes over time relative to the reference value, whereas the cancer is not progressive if the amount of DETERMINANTS remains constant over time (relative to the reference population, or “constant” as used herein). The term “constant” as used in the context of the present invention is construed to include changes over time with respect to the reference value.
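The monitoring logic above can be sketched as a comparison of serial measurements, each expressed relative to a reference value. The text does not state how large a change must be before the level is no longer “constant”, so the `tolerance` parameter below is a purely hypothetical assumption, as are the function name `classify_progression` and the example values.

```python
def classify_progression(levels_over_time, reference_value, tolerance=0.2):
    """Label a series of DETERMINANT levels as "progressive" or "constant".

    Each level is divided by the reference value; the series is called
    "progressive" if any later sample differs from the first sample by more
    than `tolerance` (a hypothetical threshold, not specified in the text).
    """
    if len(levels_over_time) < 2:
        raise ValueError("at least two samples are required")
    if reference_value <= 0:
        raise ValueError("reference value must be positive")
    relative = [level / reference_value for level in levels_over_time]
    baseline = relative[0]
    max_change = max(abs(value - baseline) for value in relative[1:])
    return "progressive" if max_change > tolerance else "constant"


# Hypothetical levels: first sample before treatment, later samples during
# or after treatment.
stable_series = classify_progression([1.0, 1.05, 1.1], reference_value=1.0)
changing_series = classify_progression([1.0, 1.5, 2.0], reference_value=1.0)
```

In practice the threshold would need to reflect the analytical accuracy of the assay and a clinically significant alteration for the DETERMINANT in question.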

For example, the methods of the invention can be used to discriminate the aggressiveness of the tumor and/or to assess the stage of the tumor (e.g., Stage I, II, III or IV). This will allow patients to be stratified into high or low risk groups and treated accordingly.

Additionally, therapeutic or prophylactic agents suitable for administration to a particular subject can be identified by detecting a DETERMINANT in an effective amount (which may be two or more) in a sample obtained from a subject, exposing the subject-derived sample to a test compound that determines the amount (which may be two or more) of DETERMINANTS in the subject-derived sample. Accordingly, treatments or therapeutic regimens for use in subjects having a cancer, or subjects at risk for developing prostate cancer can be selected based on the amounts of DETERMINANTS in samples obtained from the subjects and compared to a reference value. Two or more treatments or therapeutic regimens can be evaluated in parallel to determine which treatment or therapeutic regimen would be the most efficacious for use in a subject to delay onset, or slow progression of the cancer.

The present invention further provides a method for screening for changes in marker expression associated with metastatic prostate cancer, by determining the amount (which may be two or more) of DETERMINANTS in a subject-derived sample, comparing them to the amounts of the DETERMINANTS in a reference sample, and identifying alterations in amounts in the subject sample compared to the reference sample.

The present invention further provides a method of treating a patient with a tumor, by identifying a patient with a tumor where an effective amount of DETERMINANTS are altered in a clinically significant manner as measured in a sample from the tumor, and treating the patient with a therapeutic regimen that prevents or reduces tumor metastasis.

Additionally, the invention provides a method of selecting a tumor patient in need of adjuvant treatment by assessing the risk of metastasis or recurrence in the patient by measuring an effective amount of DETERMINANTS, where a clinically significant alteration of two or more DETERMINANTS in a tumor sample from the patient indicates that the patient is in need of adjuvant treatment.

A treatment decision for a tumor patient can be informed by obtaining information on an effective amount of DETERMINANTS in a tumor sample from the patient, and selecting a treatment regimen that prevents or reduces tumor metastasis or recurrence in the patient if two or more DETERMINANTS are altered in a clinically significant manner.

If the reference sample, e.g., a control sample, is from a subject that does not have a metastatic cancer, or if the reference sample reflects a value that is relative to a person that has a high likelihood of rapid progression to metastatic prostate cancer, a similarity in the amount of the DETERMINANT in the test sample and the reference sample indicates that the treatment is efficacious. However, a difference in the amount of the DETERMINANT in the test sample and the reference sample indicates a less favorable clinical outcome or prognosis.

By “efficacious”, it is meant that the treatment leads to a decrease in the amount or activity of a DETERMINANT protein, nucleic acid, polymorphism, metabolite, or other analyte. Assessment of the risk factors disclosed herein can be achieved using standard clinical protocols. Efficacy can be determined in association with any known method for diagnosing, identifying, or treating a disease.

The present invention also provides DETERMINANT panels including one or more DETERMINANTS that are indicative of a general physiological pathway associated with a metastatic lesion. For example, one or more DETERMINANTS can be used to exclude or distinguish between different disease states that are associated with metastasis. A single DETERMINANT may have several of the aforementioned characteristics according to the present invention, and may alternatively be used in replacement of one or more other DETERMINANTS where appropriate for the given application of the invention.

The present invention also comprises a kit with a detection reagent that binds to two or more DETERMINANT proteins, nucleic acids, polymorphisms, metabolites, or other analytes. Also provided by the invention is an array of detection reagents, e.g., antibodies and/or oligonucleotides that can bind to two or more DETERMINANT proteins or nucleic acids, respectively. In one embodiment, the DETERMINANTS are proteins and the array contains antibodies that bind two or more DETERMINANTS listed in Table 2, 3, or 5 sufficient to measure a statistically significant alteration in DETERMINANT expression compared to a reference value. In another embodiment, the DETERMINANTS are nucleic acids and the array contains oligonucleotides or aptamers that bind an effective amount of DETERMINANTS listed in Table 2, 3, or 5 sufficient to measure a statistically significant alteration in DETERMINANT expression compared to a reference value.

In another embodiment, the DETERMINANTS are proteins and the array contains antibodies that bind an effective amount of DETERMINANTS listed in Tables 2 or 3 sufficient to measure a statistically significant alteration in DETERMINANT expression compared to a reference value. In another embodiment, the DETERMINANTS are nucleic acids and the array contains oligonucleotides or aptamers that bind an effective amount of DETERMINANTS listed in any one of Table 2, 3, or 5 sufficient to measure a statistically significant alteration in DETERMINANT expression compared to a reference value.

Also provided by the present invention is a method for treating one or more subjects at risk for developing a prostate cancer by detecting the presence of altered amounts of an effective amount of DETERMINANTS present in a sample from the one or more subjects; and treating the one or more subjects with one or more cancer-modulating drugs until altered amounts or activity of the DETERMINANTS return to a baseline value measured in one or more subjects at low risk for developing a metastatic disease, or alternatively, in subjects who do not exhibit any of the traditional risk factors for metastatic disease.

Also provided by the present invention is a method for treating one or more subjects having prostate cancer by detecting the presence of altered levels of an effective amount of DETERMINANTS present in a sample from the one or more subjects; and treating the one or more subjects with one or more cancer-modulating drugs until altered amounts or activity of the DETERMINANTS return to a baseline value measured in one or more subjects at low risk for developing prostate cancer.

Also provided by the present invention is a method for evaluating changes in the risk of developing metastatic prostate cancer in a subject diagnosed with cancer, by detecting an effective amount of DETERMINANTS (which may be two or more) in a first sample from the subject at a first period of time, detecting the amounts of the DETERMINANTS in a second sample from the subject at a second period of time, and comparing the amounts of the DETERMINANTS detected at the first and second periods of time.

Diagnostic and Prognostic Indications of the Invention

The invention allows the diagnosis and prognosis of a primary and/or locally invasive cancer such as prostate or breast cancer, among other cancer types. The risk of developing metastatic prostate cancer or prostate cancer recurrence can be detected by measuring an effective amount of DETERMINANT proteins, nucleic acids, polymorphisms, metabolites, and other analytes (which may be two or more) in a test sample (e.g., a subject-derived sample), and comparing the effective amounts to reference or index values, often utilizing mathematical algorithms or formulae in order to combine information from results of multiple individual DETERMINANTS and from non-analyte clinical parameters into a single measurement or index. Subjects identified as having an increased risk of a metastatic prostate cancer, prostate cancer recurrence, or other metastatic cancer types can optionally be selected to receive treatment regimens, such as administration of prophylactic or therapeutic compounds to prevent or delay the onset of metastatic prostate cancer, prostate cancer recurrence or other metastatic cancer types.
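As a non-limiting sketch of how results from multiple individual DETERMINANTS might be combined into a single index, the following example uses a logistic combination. The choice of a logistic form, the function name `determinant_index`, and the marker names and weights shown in the usage note are hypothetical and are not taken from this disclosure.

```python
import math
from typing import Mapping


def determinant_index(values: Mapping[str, float],
                      weights: Mapping[str, float],
                      intercept: float = 0.0) -> float:
    """Combine several measurements into one index between 0 and 1.

    `values` maps a marker or clinical-parameter name to its measured
    amount; `weights` maps the same names to coefficients that would in
    practice be fitted on a training cohort (assumed here, not supplied).
    """
    missing = [name for name in weights if name not in values]
    if missing:
        raise KeyError(f"no measurement supplied for: {missing}")
    # Weighted linear combination of the individual measurements.
    z = intercept + sum(weights[name] * values[name] for name in weights)
    # Logistic transform maps the linear score onto the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))
```

For instance, with hypothetical weights `{"DET_A": 1.2, "DET_B": -0.7}`, raising the amount of `DET_A` while holding `DET_B` fixed raises the index, which could then be compared to a reference or index value.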

The amount of the DETERMINANT protein, nucleic acid, polymorphism, metabolite, or other analyte can be measured in a test sample and compared to the “normal control level,” utilizing techniques such as reference limits, discrimination limits, or risk defining thresholds to define cutoff points and abnormal values. The “normal control level” means the level of one or more DETERMINANTS or combined DETERMINANT indices typically found in a subject not suffering from a prostate cancer. Such normal control level and cutoff points may vary based on whether a DETERMINANT is used alone or in a formula combining with other DETERMINANTS into an index. Alternatively, the normal control level can be a database of DETERMINANT patterns from previously tested subjects who did not develop a prostate cancer over a clinically relevant time horizon.
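One possible way of deriving cutoff points from a normal control cohort is sketched below for illustration only. The use of the cohort mean plus or minus two standard deviations as reference limits is an assumption of this sketch, as are the function names; this disclosure does not prescribe a particular rule.

```python
import statistics
from typing import Sequence, Tuple


def reference_limits(normal_levels: Sequence[float],
                     n_sd: float = 2.0) -> Tuple[float, float]:
    """Return (lower, upper) limits from a normal control cohort.

    Uses mean +/- n_sd sample standard deviations, a common convention
    assumed here for illustration.
    """
    mean = statistics.mean(normal_levels)
    sd = statistics.stdev(normal_levels)  # requires at least two values
    return mean - n_sd * sd, mean + n_sd * sd


def is_abnormal(level: float, limits: Tuple[float, float]) -> bool:
    """True if a test-sample level falls outside the reference limits."""
    lower, upper = limits
    return level < lower or level > upper
```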

The present invention may be used to make continuous or categorical measurements of the risk of conversion to metastatic prostate cancer, prostate cancer recurrence or other metastatic cancer types, thus diagnosing and defining the risk spectrum of a category of subjects defined as at risk for having a metastatic or recurrent event. In the categorical scenario, the methods of the present invention can be used to discriminate between normal and disease subject cohorts. In other embodiments, the present invention may be used so as to discriminate those at risk for having a metastatic or recurrent event from those having more rapidly progressing (or alternatively those with a shorter probable time horizon to a metastatic or recurrent event) to a metastatic event from those more slowly progressing (or with a longer time horizon to a metastatic event), or those having metastatic cancer from normal. Such differing use may require different DETERMINANT combinations in individual panels, mathematical algorithms, and/or cut-off points, but be subject to the same aforementioned measurements of accuracy and other performance metrics relevant for the intended use.

Identifying the subject at risk of having a metastatic or recurrent event enables the selection and initiation of various therapeutic interventions or treatment regimens in order to delay, reduce or prevent that subject's conversion to a metastatic disease state. Levels of an effective amount of DETERMINANT proteins, nucleic acids, polymorphisms, metabolites, or other analytes also allow for the course of treatment of a metastatic disease or metastatic event to be monitored. In this method, a biological sample can be provided from a subject undergoing treatment regimens, e.g., drug treatments, for cancer. If desired, biological samples are obtained from the subject at various time points before, during, or after treatment.

By virtue of some DETERMINANTS being functionally active, and by elucidating their function, subjects with high DETERMINANTS, for example, can be managed with agents/drugs that preferentially target such function.

The present invention can also be used to screen patient or subject populations in any number of settings. For example, a health maintenance organization, public health entity or school health program can screen a group of subjects to identify those requiring interventions, as described above, or for the collection of epidemiological data. Insurance companies (e.g., health, life or disability) may screen applicants in the process of determining coverage or pricing, or existing clients for possible intervention. Data collected in such population screens, particularly when tied to any clinical progression to conditions like cancer or metastatic events, will be of value in the operations of, for example, health maintenance organizations, public health programs and insurance companies. Such data arrays or collections can be stored in machine-readable media and used in any number of health-related data management systems to provide improved healthcare services, cost effective healthcare, improved insurance operation, etc. See, for example, U.S. Patent Application No. 2002/0038227; U.S. Patent Application No. US 2004/0122296; U.S. Patent Application No. US 2004/0122297; and U.S. Pat. No. 5,018,067. Such systems can access the data directly from internal data storage or remotely from one or more data storage sites as further detailed herein.

A machine-readable storage medium can comprise a data storage material encoded with machine readable data or data arrays which, when using a machine programmed with instructions for using said data, is capable of use for a variety of purposes, such as, without limitation, subject information relating to metastatic disease risk factors over time or in response to drug therapies. Measurements of effective amounts of the biomarkers of the invention and/or the resulting evaluation of risk from those biomarkers can be implemented in computer programs executing on programmable computers, comprising, inter alia, a processor, a data storage system (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Program code can be applied to input data to perform the functions described above and generate output information. The output information can be applied to one or more output devices, according to methods known in the art. The computer may be, for example, a personal computer, microcomputer, or workstation of conventional design.

Each program can be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the programs can be implemented in assembly or machine language, if desired. The language can be a compiled or interpreted language. Each such computer program can be stored on a storage media or device (e.g., ROM or magnetic diskette or others as defined elsewhere in this disclosure) readable by a general or special purpose programmable computer, for configuring and operating the computer when the storage media or device is read by the computer to perform the procedures described herein. The health-related data management system of the invention may also be considered to be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform various functions described herein.

Levels of an effective amount of DETERMINANT proteins, nucleic acids, polymorphisms, metabolites, or other analytes can then be determined and compared to a reference value, e.g., a control subject or population whose metastatic state is known or an index value or baseline value. The reference sample or index value or baseline value may be taken or derived from one or more subjects who have been exposed to the treatment, or may be taken or derived from one or more subjects who are at low risk of developing cancer or a metastatic event, or may be taken or derived from subjects who have shown improvements in risk factors as a result of exposure to treatment. Alternatively, the reference sample or index value or baseline value may be taken or derived from one or more subjects who have not been exposed to the treatment. For example, samples may be collected from subjects who have received initial treatment for cancer or a metastatic event and subsequent treatment for cancer or a metastatic event to monitor the progress of the treatment. A reference value can also comprise a value derived from risk prediction algorithms or computed indices from population studies such as those disclosed herein.

The DETERMINANTS of the present invention can thus be used to generate a “reference DETERMINANT profile” of those subjects who do not have cancer or are not at risk of having a metastatic event, and would not be expected to develop cancer or a metastatic event. The DETERMINANTS disclosed herein can also be used to generate a “subject DETERMINANT profile” taken from subjects who have cancer or are at risk for having a metastatic event. The subject DETERMINANT profiles can be compared to a reference DETERMINANT profile to diagnose or identify subjects at risk for developing cancer or a metastatic event, to monitor the progression of disease, as well as the rate of progression of disease, and to monitor the effectiveness of treatment modalities. The reference and subject DETERMINANT profiles of the present invention can be contained in a machine-readable medium, such as but not limited to, analog tapes like those readable by a VCR, CD-ROM, DVD-ROM, USB flash media, among others. Such machine-readable media can also contain additional test results, such as, without limitation, measurements of clinical parameters and traditional laboratory risk factors. Alternatively or additionally, the machine-readable media can also comprise subject information such as medical history and any relevant family history. The machine-readable media can also contain information relating to other disease-risk algorithms and computed indices such as those described herein.

Differences in the genetic makeup of subjects can result in differences in their relative abilities to metabolize various drugs, which may modulate the symptoms or risk factors of cancer or metastatic events. Subjects that have cancer, or are at risk for developing cancer, a recurrent cancer or a metastatic cancer can vary in age, ethnicity, and other parameters. Accordingly, use of the DETERMINANTS disclosed herein, both alone and together in combination with known genetic factors for drug metabolism, allows for a pre-determined level of predictability that a putative therapeutic or prophylactic to be tested in a selected subject will be suitable for treating or preventing cancer or a metastatic event in the subject.

To identify therapeutics or drugs that are appropriate for a specific subject, a test sample from the subject can also be exposed to a therapeutic agent or a drug, and the level of one or more of DETERMINANT proteins, nucleic acids, polymorphisms, metabolites or other analytes can be determined. The amount of one or more DETERMINANTS can be compared to a sample derived from the subject before and after treatment or exposure to a therapeutic agent or a drug, or can be compared to samples derived from one or more subjects who have shown improvements in risk factors (e.g., clinical parameters or traditional laboratory risk factors) as a result of such treatment or exposure.

A subject cell (i.e., a cell isolated from a subject) can be incubated in the presence of a candidate agent and the pattern of DETERMINANT expression in the test sample is measured and compared to a reference profile, e.g., a metastatic disease reference expression profile or a non-disease reference expression profile or an index value or baseline value. The test agent can be any compound or composition or combination thereof, including dietary supplements. For example, the test agents are agents frequently used in cancer treatment regimens and are described herein.

The aforementioned methods of the invention can be used to evaluate or monitor the progression and/or improvement of subjects who have been diagnosed with a cancer, and who have undergone surgical interventions.

Performance and Accuracy Measures of the Invention

The performance and thus absolute and relative clinical usefulness of the invention may be assessed in multiple ways as noted above. Amongst the various assessments of performance, the invention is intended to provide accuracy in clinical diagnosis and prognosis. The accuracy of a diagnostic or prognostic test, assay, or method concerns the ability of the test, assay, or method to distinguish between subjects having cancer, or at risk for cancer or a metastatic event, and is based on whether the subjects have a “significant alteration” (e.g., a clinically significant and/or diagnostically significant alteration) in the levels of a DETERMINANT. By “effective amount” it is meant the measurement of an appropriate number of DETERMINANTS (which may be one or more) to produce a “significant alteration” (e.g., a level of expression or activity of a DETERMINANT) that is different than the predetermined cut-off point (or threshold value) for that DETERMINANT(S) and therefore indicates that the subject has cancer or is at risk for having a metastatic event for which the DETERMINANT(S) is a determinant. The difference in the level of DETERMINANT between normal and abnormal is preferably statistically significant. As noted below, and without any limitation of the invention, achieving statistical significance, and thus the preferred analytical, diagnostic, and clinical accuracy, generally but not always requires that combinations of several DETERMINANTS be used together in panels and combined with mathematical algorithms in order to achieve a statistically significant DETERMINANT index.

In the categorical diagnosis of a disease state, changing the cut point or threshold value of a test (or assay) usually changes the sensitivity and specificity, but in a qualitatively inverse relationship. Therefore, in assessing the accuracy and usefulness of a proposed medical test, assay, or method for assessing a subject's condition, one should always take both sensitivity and specificity into account and be mindful of what the cut point is at which the sensitivity and specificity are being reported because sensitivity and specificity may vary significantly over the range of cut points. Use of statistics such as AUC, encompassing all potential cut point values, is preferred for most categorical risk measures using the invention, while for continuous risk measures, statistics of goodness-of-fit and calibration to observed results or other gold standards, are preferred.
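The inverse relationship between sensitivity and specificity as the cut point moves, and the cut-point-independent nature of AUC, may be illustrated by the following minimal sketch. The function names and the convention that scores at or above the cut point are called positive are assumptions of the sketch; the AUC is computed by the standard pairwise (rank) comparison of diseased and non-diseased subjects.

```python
from typing import Sequence, Tuple


def sens_spec(scores: Sequence[float], labels: Sequence[int],
              cut: float) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= cut is called positive.

    `labels` uses 1 for diseased and 0 for non-diseased subjects.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cut)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cut)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen diseased subject scores
    higher than a randomly chosen non-diseased subject (ties count half).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With scores `[0.1, 0.4, 0.35, 0.8]` and labels `[0, 0, 1, 1]`, lowering the cut point from 0.5 to 0.3 raises sensitivity from 0.5 to 1.0 while lowering specificity from 1.0 to 0.5; the AUC, which does not depend on any single cut point, is 0.75.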

By “predetermined level of predictability” it is meant that the method provides an acceptable level of clinical or diagnostic accuracy. Using such statistics, an “acceptable degree of diagnostic accuracy” is herein defined as a test or assay (such as the test of the invention for determining the clinically significant presence of DETERMINANTS, which thereby indicates the presence of cancer and/or a risk of having a metastatic event) in which the AUC (area under the ROC curve for the test or assay) is at least 0.60, desirably at least 0.65, more desirably at least 0.70, preferably at least 0.75, more preferably at least 0.80, and most preferably at least 0.85.

By a “very high degree of diagnostic accuracy”, it is meant a test or assay in which the AUC (area under the ROC curve for the test or assay) is at least 0.75, 0.80, desirably at least 0.85, more desirably at least 0.875, preferably at least 0.90, more preferably at least 0.925, and most preferably at least 0.95.
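For illustration only, the minimum AUC bounds recited in the two preceding paragraphs can be expressed as a simple lookup. The function name and the returned labels are hypothetical; only the lowest recited bounds of 0.60 and 0.75 are taken from the text, and the more preferred bounds are omitted for brevity.

```python
def accuracy_tier(auc_value: float) -> str:
    """Map an AUC to the broadest tier whose minimum bound it satisfies."""
    if not 0.0 <= auc_value <= 1.0:
        raise ValueError("AUC must lie between 0 and 1")
    if auc_value >= 0.75:
        # Lowest bound recited for a "very high degree of diagnostic accuracy".
        return "very high"
    if auc_value >= 0.60:
        # Lowest bound recited for an "acceptable degree of diagnostic accuracy".
        return "acceptable"
    return "below acceptable"
```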

Alternatively, the methods predict the presence or absence of a cancer,metastatic cancer or response to therapy with at least 75% accuracy,more preferably 80%, 85%, 90%, 95%, 97%, 98%, 99% or greater accuracy.

The predictive value of any test depends on the sensitivity and specificity of the test, and on the prevalence of the condition in the population being tested. This notion, based on Bayes' theorem, provides that the greater the likelihood that the condition being screened for is present in an individual or in the population (pre-test probability), the greater the validity of a positive test and the greater the likelihood that the result is a true positive. Thus, the problem with using a test in any population where there is a low likelihood of the condition being present is that a positive result has limited value (i.e., more likely to be a false positive). Similarly, in populations at very high risk, a negative test result is more likely to be a false negative.
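The dependence of predictive value on prevalence follows directly from Bayes' theorem and may be sketched as below. The function name and the example figures in the usage note are illustrative assumptions rather than values from this disclosure.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given pre-test probability."""
    # Expected fractions of the tested population in each outcome cell.
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    tn = specificity * (1.0 - prevalence)
    fp = (1.0 - specificity) * (1.0 - prevalence)
    ppv = tp / (tp + fp)  # probability of disease given a positive result
    npv = tn / (tn + fn)  # probability of no disease given a negative result
    return ppv, npv
```

For a hypothetical test with 90% sensitivity and 90% specificity, the positive predictive value is about 8% at 1% prevalence but 90% at 50% prevalence; conversely, at 99% prevalence the negative predictive value falls to about 8%.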

As a result, ROC and AUC can be misleading as to the clinical utility of a test in low disease prevalence tested populations (defined as those with less than 1% rate of occurrences (incidence) per annum, or less than 10% cumulative prevalence over a specified time horizon). Alternatively, absolute risk and relative risk ratios as defined elsewhere in this disclosure can be employed to determine the degree of clinical utility. Populations of subjects to be tested can also be categorized into quartiles by the test's measurement values, where the top quartile (25% of the population) comprises the group of subjects with the highest relative risk for developing cancer or a metastatic event, and the bottom quartile comprises the group of subjects having the lowest relative risk for developing cancer or a metastatic event. Generally, values derived from tests or assays having over 2.5 times the relative risk from top to bottom quartile in a low prevalence population are considered to have a “high degree of diagnostic accuracy,” and those with five to seven times the relative risk for each quartile are considered to have a “very high degree of diagnostic accuracy.” Nonetheless, values derived from tests or assays having only 1.2 to 2.5 times the relative risk for each quartile remain clinically useful and are widely used as risk factors for a disease; such is the case with total cholesterol and for many inflammatory biomarkers with respect to their prediction of future metastatic events. Often such lower diagnostic accuracy tests must be combined with additional parameters in order to derive meaningful clinical thresholds for therapeutic intervention, as is done with the aforementioned global risk assessment indices.
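A minimal sketch of the top-to-bottom quartile relative risk described above follows. The function name is hypothetical, and the handling of cohorts whose size is not divisible by four (the remainder is left in the middle quartiles) is an assumption of the sketch.

```python
from typing import Sequence


def quartile_relative_risk(scores: Sequence[float],
                           events: Sequence[int]) -> float:
    """Event rate in the top score quartile divided by that in the bottom.

    `events` uses 1 where the subject developed the event and 0 otherwise.
    """
    pairs = sorted(zip(scores, events), key=lambda p: p[0])
    q = len(pairs) // 4  # subjects per outer quartile
    if q == 0:
        raise ValueError("need at least four subjects")
    risk_bottom = sum(e for _, e in pairs[:q]) / q
    risk_top = sum(e for _, e in pairs[-q:]) / q
    if risk_bottom == 0:
        raise ValueError("no events in bottom quartile; ratio undefined")
    return risk_top / risk_bottom
```

The resulting ratio could then be compared with the 2.5-fold figure recited above for a “high degree of diagnostic accuracy.”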

A health economic utility function is yet another means of measuring the performance and clinical value of a given test, consisting of weighting the potential categorical test outcomes based on actual measures of clinical and economic value for each. Health economic performance is closely related to accuracy, as a health economic utility function specifically assigns an economic value for the benefits of correct classification and the costs of misclassification of tested subjects. As a performance measure, it is not unusual to require a test to achieve a level of performance which results in an increase in health economic value per test (prior to testing costs) in excess of the target price of the test.
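One simple form such a health economic utility function could take is sketched below, purely as an illustration. The function name, the four outcome weights, and the assumption that outcomes are weighted by their expected frequency are choices made for this sketch and are not defined in this disclosure.

```python
def expected_value_per_test(sensitivity: float, specificity: float,
                            prevalence: float, value_tp: float,
                            value_tn: float, cost_fp: float,
                            cost_fn: float) -> float:
    """Expected economic value of one test, before testing costs.

    Each categorical outcome is weighted by its probability and by a
    hypothetical economic value (benefit) or cost assigned to it.
    """
    p_tp = prevalence * sensitivity
    p_fn = prevalence * (1.0 - sensitivity)
    p_tn = (1.0 - prevalence) * specificity
    p_fp = (1.0 - prevalence) * (1.0 - specificity)
    return (p_tp * value_tp + p_tn * value_tn
            - p_fp * cost_fp - p_fn * cost_fn)
```

Under this sketch, the performance requirement stated above amounts to checking that the returned value exceeds the target price of the test.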

In general, alternative methods of determining diagnostic accuracy are commonly used for continuous measures, when a disease category or risk category (such as those at risk or having a metastatic event) has not yet been clearly defined by the relevant medical societies and practice of medicine, where thresholds for therapeutic use are not yet established, or where there is no existing gold standard for diagnosis of the pre-disease. For continuous measures of risk, measures of diagnostic accuracy for a calculated index are typically based on curve fit and calibration between the predicted continuous value and the actual observed values (or a historical index calculated value) and utilize measures such as R squared, Hosmer-Lemeshow P-value statistics and confidence intervals. It is not unusual for predicted values using such algorithms to be reported including a confidence interval (usually 90% or 95% CI) based on a historical observed cohort's predictions, as in the test for risk of future breast cancer recurrence commercialized by Genomic Health, Inc. (Redwood City, Calif.).
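The curve-fit and calibration measures named above may be sketched as follows. The function names are hypothetical, the grouping into equal-sized bins by predicted risk is one common convention assumed here, and only the Hosmer-Lemeshow statistic is computed; converting it to a P-value requires a chi-squared distribution, which is left out of this sketch.

```python
from typing import Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination of predictions against observations."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def hosmer_lemeshow_statistic(predicted: Sequence[float],
                              outcomes: Sequence[int],
                              groups: int = 10) -> float:
    """Hosmer-Lemeshow calibration statistic over bins of predicted risk."""
    pairs = sorted(zip(predicted, outcomes))
    n = len(pairs)
    if groups < 1 or n < groups:
        raise ValueError("need at least one subject per group")
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected = sum(p for p, _ in chunk)  # expected number of events
        observed = sum(y for _, y in chunk)  # observed number of events
        variance = expected * (1.0 - expected / size)
        if variance <= 0:
            raise ValueError("group with degenerate predicted risk")
        stat += (observed - expected) ** 2 / variance
    return stat
```

Smaller values of the statistic indicate closer agreement between predicted and observed event counts within each risk group.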

In general, defining the degree of diagnostic accuracy, i.e., cut points on a ROC curve, defining an acceptable AUC value, and determining the acceptable ranges in relative concentration of what constitutes an effective amount of the DETERMINANTS of the invention allows one of skill in the art to use the DETERMINANTS to identify, diagnose, or prognose subjects with a pre-determined level of predictability and performance.

Risk Markers of the Invention (Determinants)

The biomarkers and methods of the present invention allow one of skill in the art to identify, diagnose, or otherwise assess those subjects who do not exhibit any symptoms of cancer or a metastatic event, but who nonetheless may be at risk for developing cancer or a metastatic event.

One skilled in the art will recognize that the DETERMINANTS presented herein encompass all forms and variants, including but not limited to, polymorphisms, isoforms, mutants, derivatives, precursors including nucleic acids and pro-proteins, cleavage products, receptors (including soluble and transmembrane receptors), ligands, protein-ligand complexes, and post-translationally modified variants (such as cross-linking or glycosylation), fragments, and degradation products, as well as any multi-unit nucleic acid, protein, and glycoprotein structures comprised of any of the DETERMINANTS as constituent sub-units of the fully assembled structure.

One skilled in the art will note that the above listed DETERMINANTS come from a diverse set of physiological and biological pathways, including many which are not commonly accepted to be related to metastatic disease. These groupings of different DETERMINANTS, even within those high significance segments, may presage differing signals of the stage or rate of the progression of the disease. Such distinct groupings of DETERMINANTS may allow a more biologically detailed and clinically useful signal from the DETERMINANTS as well as opportunities for pattern recognition within the DETERMINANT algorithms combining the multiple DETERMINANT signals.

The present invention concerns, in one aspect, a subset of DETERMINANTS; other DETERMINANTS and even biomarkers which are not listed in Table 2, 3, or 5, but related to these physiological and biological pathways, may prove to be useful given the signal and information provided from these studies. To the extent that other biomarker pathway participants (i.e., other biomarker participants in common pathways with those biomarkers contained within the list of DETERMINANTS in Table 2, 3, or 5) are also relevant pathway participants in cancer or a metastatic event, they may be functional equivalents to the biomarkers thus far disclosed in Table 2, 3, or 5. These other pathway participants are also considered DETERMINANTS in the context of the present invention, provided they additionally share certain defined characteristics of a good biomarker, which would include both involvement in the herein disclosed biological processes and also analytically important characteristics such as the bioavailability of said biomarkers at a useful signal to noise ratio, and in a useful and accessible sample matrix such as blood serum or a tumor biopsy. Such requirements typically limit the diagnostic usefulness of many members of a biological pathway, and such usefulness frequently occurs only in pathway members that constitute secretory substances, those accessible on the plasma membranes of cells, as well as those that are released into the serum upon cell death, due to apoptosis or for other reasons such as endothelial remodeling or other cell turnover or cell necrotic processes, whether or not they are related to the disease progression of cancer or a metastatic event. However, the remaining and future biomarkers that meet this high standard for DETERMINANTS are likely to be quite valuable.

Furthermore, other unlisted biomarkers will be very highly correlated with the biomarkers listed as DETERMINANTS in Table 2, 3, or 5 (for the purpose of this application, any two variables will be considered to be “very highly correlated” when they have a Coefficient of Determination (R²) of 0.5 or greater). The present invention encompasses such functional and statistical equivalents to the aforementioned DETERMINANTS. Furthermore, the statistical utility of such additional DETERMINANTS is substantially dependent on the cross-correlation between multiple biomarkers and any new biomarkers will often be required to operate within a panel in order to elaborate the meaning of the underlying biology.
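The “very highly correlated” criterion may be illustrated by the following sketch, which assumes that the Coefficient of Determination between two variables is computed as the square of their Pearson correlation coefficient. That computation, and the function names, are assumptions of the sketch; only the 0.5 threshold is taken from the text.

```python
from typing import Sequence


def coefficient_of_determination(x: Sequence[float],
                                 y: Sequence[float]) -> float:
    """Squared Pearson correlation between two paired measurement series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two paired series of at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


def very_highly_correlated(x: Sequence[float], y: Sequence[float],
                           threshold: float = 0.5) -> bool:
    """Apply the 0.5 threshold recited in the text."""
    return coefficient_of_determination(x, y) >= threshold
```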

One or more, preferably two or more of the listed DETERMINANTS can be detected in the practice of the present invention. For example, two (2), three (3), four (4), five (5), ten (10), fifteen (15), twenty (20), forty (40), fifty (50), seventy-five (75), one hundred (100), one hundred and twenty five (125), one hundred and fifty (150), one hundred and seventy-five (175), two hundred (200), two hundred and ten (210), two hundred and twenty (220) or more DETERMINANTS can be detected.

In some aspects, all DETERMINANTS listed herein can be detected. Preferred ranges from which the number of DETERMINANTS can be detected include ranges bounded by any minimum selected from between one and 741, particularly two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, twenty-one, twenty-five, thirty, fifty, seventy-five, one hundred, one hundred and twenty five, one hundred and fifty, one hundred and seventy-five, two hundred, two hundred and ten, two hundred and twenty, paired with any maximum up to the total known DETERMINANTS, particularly four, five, ten, twenty, fifty, and seventy-five. Particularly preferred ranges include two to five (2-5), two to ten (2-10), two to fifty (2-50), two to seventy-five (2-75), two to one hundred (2-100), five to ten (5-10), five to twenty (5-20), five to fifty (5-50), five to seventy-five (5-75), five to one hundred (5-100), ten to twenty (10-20), ten to fifty (10-50), ten to seventy-five (10-75), ten to one hundred (10-100), twenty to fifty (20-50), twenty to seventy-five (20-75), twenty to one hundred (20-100), fifty to seventy-five (50-75), fifty to one hundred (50-100), one hundred to one hundred and twenty-five (100-125), one hundred and twenty-five to one hundred and fifty (125-150), one hundred and fifty to one hundred and seventy-five (150-175), one hundred and seventy-five to two hundred (175-200), two hundred to two hundred and ten (200-210), two hundred and ten to two hundred and twenty (210-220).

Construction of Determinant Panels

Groupings of DETERMINANTS can be included in “panels.” A “panel” within the context of the present invention means a group of biomarkers (whether they are DETERMINANTS, clinical parameters, or traditional laboratory risk factors) that includes more than one DETERMINANT. A panel can also comprise additional biomarkers, e.g., clinical parameters, traditional laboratory risk factors, known to be present or associated with cancer or cancer metastasis, in combination with a selected group of the DETERMINANTS listed in Table 2.

As noted above, many of the individual DETERMINANTS, clinical parameters, and traditional laboratory risk factors listed, when used alone and not as a member of a multi-biomarker panel of DETERMINANTS, have little or no clinical use in reliably distinguishing individual normal subjects, subjects at risk for having a metastatic event, and subjects having cancer from each other in a selected general population, and thus cannot reliably be used alone in classifying any subject between those three states. Even where there are statistically significant differences in their mean measurements in each of these populations, as commonly occurs in studies which are sufficiently powered, such biomarkers may remain limited in their applicability to an individual subject, and contribute little to diagnostic or prognostic predictions for that subject. A common measure of statistical significance is the p-value, which indicates the probability that an observation has arisen by chance alone; preferably, such p-values are 0.05 or less, representing a 5% or less chance that the observation of interest arose by chance. Such p-values depend significantly on the power of the study performed.

Despite this individual DETERMINANT performance, and the general performance of formulas combining only the traditional clinical parameters and few traditional laboratory risk factors, the present inventors have noted that certain specific combinations of two or more DETERMINANTS can also be used as multi-biomarker panels comprising combinations of DETERMINANTS that are known to be involved in one or more physiological or biological pathways, and that such information can be combined and made clinically useful through the use of various formulae, including statistical classification algorithms and others, combining and in many cases extending the performance characteristics of the combination beyond that of the individual DETERMINANTS. These specific combinations show an acceptable level of diagnostic accuracy, and, when sufficient information from multiple DETERMINANTS is combined in a trained formula, often reliably achieve a high level of diagnostic accuracy transportable from one population to another.

The general concept of how two less specific or lower performing DETERMINANTS are combined into novel and more useful combinations for the intended indications is a key aspect of the invention. Multiple biomarkers can often yield better performance than the individual components when proper mathematical and clinical algorithms are used; this is often evident in both sensitivity and specificity, and results in a greater AUC. Secondly, the existing biomarkers often contain novel, previously unperceived information, since such information is necessary for the new formula to achieve an improved level of sensitivity or specificity. This hidden information may hold true even for biomarkers which are generally regarded to have suboptimal clinical performance on their own. In fact, the suboptimal performance in terms of high false positive rates on a single biomarker measured alone may very well be an indicator that some important additional information is contained within the biomarker results, information which would not be elucidated absent the combination with a second biomarker and a mathematical formula.
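The effect described above, in which a combination outperforms its individual components in terms of AUC, can be illustrated with a minimal sketch. The data below are hypothetical and constructed for illustration; the combining formula is assumed to be a simple unweighted sum, which is only one of many possible formulae; and the AUC is computed as the proportion of case/control pairs ranked correctly, with ties counted as one half.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical (marker_1, marker_2) measurements, for illustration only.
controls = [(0, 2), (2, 0), (1, 1), (0, 0)]
cases = [(3, 1), (1, 3), (2, 2), (3, 3)]

# Each marker evaluated alone.
auc_marker_1 = auc([c[0] for c in cases], [c[0] for c in controls])
auc_marker_2 = auc([c[1] for c in cases], [c[1] for c in controls])

# Both markers combined with an assumed unweighted sum.
auc_combined = auc([sum(c) for c in cases], [sum(c) for c in controls])

print(auc_marker_1, auc_marker_2, auc_combined)
```

In this constructed data set each marker alone misranks some case/control pairs, whereas the summed index separates the two groups completely.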

Several statistical and modeling algorithms known in the art can be used both to assist in DETERMINANT selection choices and to optimize the algorithms combining these choices. Statistical tools such as factor and cross-biomarker correlation/covariance analyses allow more rational approaches to panel construction. Mathematical clustering and classification trees showing the Euclidean standardized distance between the DETERMINANTS can be advantageously used. Pathway informed seeding of such statistical classification techniques also may be employed, as may rational approaches based on the selection of individual DETERMINANTS based on their participation in particular pathways or physiological functions.

Ultimately, formulae such as statistical classification algorithms can be directly used both to select DETERMINANTS and to generate and train the optimal formula necessary to combine the results from multiple DETERMINANTS into a single index. Often, techniques such as forward (from zero potential explanatory parameters) and backwards selection (from all available potential explanatory parameters) are used, and information criteria, such as AIC or BIC, are used to quantify the tradeoff between the performance and diagnostic accuracy of the panel and the number of DETERMINANTS used. The position of the individual DETERMINANT on a forward or backwards selected panel can be closely related to its provision of incremental information content for the algorithm, so the order of contribution is highly dependent on the other constituent DETERMINANTS in the panel.
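A minimal sketch of forward selection guided by an information criterion follows. It assumes the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of DETERMINANTS in the panel, n the number of subjects, and ln L the maximized log-likelihood of the fitted model. The function toy_log_likelihood and the marker names "A", "B", and "C" are hypothetical stand-ins for a real model fit; they are chosen so that marker "B" adds less information once marker "A" is already in the panel.

```python
from math import log


def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * log(n) - 2 * log_likelihood


def forward_select(candidates, fit_log_likelihood, criterion):
    """Greedy forward selection starting from an empty panel.

    At each step, add the marker that lowers the criterion the most;
    stop when no remaining marker improves it.
    """
    panel = []
    best = criterion(fit_log_likelihood(tuple(panel)), 0)
    remaining = list(candidates)
    while remaining:
        scored = [
            (criterion(fit_log_likelihood(tuple(panel + [m])), len(panel) + 1), m)
            for m in remaining
        ]
        score, marker = min(scored)
        if score >= best:
            break
        panel.append(marker)
        remaining.remove(marker)
        best = score
    return panel, best


def toy_log_likelihood(panel):
    """Hypothetical log-likelihoods standing in for a real model fit."""
    ll = -100.0
    if "A" in panel:
        ll += 10.0
    if "B" in panel:
        ll += 1.5 if "A" in panel else 5.0  # B is partly redundant with A
    if "C" in panel:
        ll += 0.5
    return ll


aic_panel, aic_score = forward_select(["A", "B", "C"], toy_log_likelihood, aic)
bic_panel, bic_score = forward_select(
    ["A", "B", "C"], toy_log_likelihood, lambda ll, k: bic(ll, k, 100)
)
print(aic_panel, bic_panel)
```

In this toy setting the AIC-guided search retains two markers while the more heavily penalized BIC-guided search retains one, illustrating the tradeoff between panel performance and panel size; the contribution of "B" also depends on whether "A" is already present.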

Construction of Clinical Algorithms

Any formula may be used to combine DETERMINANT results into indices useful in the practice of the invention. As indicated above, and without limitation, such indices may indicate, among the various other indications, the probability, likelihood, absolute or relative risk, time to or rate of conversion from one disease state to another, or make predictions of future biomarker measurements of metastatic disease. This may be for a specific time period or horizon, or for remaining lifetime risk, or simply be provided as an index relative to another reference subject population.

Although various preferred formulae are described here, several other model and formula types beyond those mentioned herein and in the definitions above are well known to one skilled in the art. The actual model type or formula used may itself be selected from the field of potential models based on the performance and diagnostic accuracy characteristics of its results in a training population. The specifics of the formula itself may commonly be derived from DETERMINANT results in the relevant training population. Amongst other uses, such formulae may be intended to map the feature space derived from one or more DETERMINANT inputs to a set of subject classes (e.g., useful in predicting class membership of subjects as normal, at risk for having a metastatic event, having cancer), to derive an estimation of a probability function of risk using a Bayesian approach (e.g., the risk of cancer or a metastatic event), or to estimate the class-conditional probabilities, then use Bayes' rule to produce the class probability function as in the previous case.
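The last approach mentioned above, estimating class-conditional probabilities and then applying Bayes' rule, can be sketched as follows. The sketch assumes a single DETERMINANT measurement whose class-conditional distribution is Gaussian; the priors, means, and standard deviations shown are hypothetical placeholders rather than values derived from any training population.

```python
from math import exp, pi, sqrt


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return exp(-0.5 * z * z) / (sd * sqrt(2.0 * pi))


def posterior(x, classes):
    """Apply Bayes' rule.

    classes maps a class name to (prior, mean, sd); the result maps each
    class name to P(class | x).
    """
    joint = {
        name: prior * gaussian_pdf(x, mean, sd)
        for name, (prior, mean, sd) in classes.items()
    }
    total = sum(joint.values())
    return {name: value / total for name, value in joint.items()}


# Hypothetical parameters for the three subject classes named in the text.
subject_classes = {
    "normal": (0.70, 1.0, 0.5),
    "at risk for a metastatic event": (0.20, 2.0, 0.5),
    "cancer": (0.10, 3.0, 0.5),
}
print(posterior(2.4, subject_classes))
```

The posterior probabilities sum to one, so the output can be read directly as a class probability function for the measured subject.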

Preferred formulas include the broad class of statistical classification algorithms, and in particular the use of discriminant analysis. The goal of discriminant analysis is to predict class membership from a previously identified set of features. In the case of linear discriminant analysis (LDA), the linear combination of features is identified that maximizes the separation among groups by some criterion. Features can be identified for LDA using an eigengene based approach with different thresholds (ELDA) or a stepping algorithm based on a multivariate analysis of variance (MANOVA). Forward, backward, and stepwise algorithms can be performed that minimize the probability of no separation based on the Hotelling-Lawley statistic.
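As a minimal sketch of the linear combination sought by LDA, the two-class, two-feature case can be written out directly. The sketch assumes Fisher's criterion (the weight vector is the inverse of the pooled within-class scatter matrix applied to the difference of the class means) and equal class priors, so that the decision threshold lies at the projected midpoint of the two means. The training values are hypothetical.

```python
def mean_vector(rows):
    """Mean of a list of (feature_1, feature_2) pairs."""
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)


def scatter(rows, m):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix about mean m."""
    sxx = sum((r[0] - m[0]) ** 2 for r in rows)
    sxy = sum((r[0] - m[0]) * (r[1] - m[1]) for r in rows)
    syy = sum((r[1] - m[1]) ** 2 for r in rows)
    return sxx, sxy, syy


def fit_lda(class0, class1):
    """Return (weights, threshold) of a two-class Fisher discriminant."""
    m0, m1 = mean_vector(class0), mean_vector(class1)
    a0, b0, c0 = scatter(class0, m0)
    a1, b1, c1 = scatter(class1, m1)
    a, b, c = a0 + a1, b0 + b1, c0 + c1  # pooled scatter [[a, b], [b, c]]
    det = a * c - b * b
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # w = inverse(pooled scatter) applied to (m1 - m0)
    w = ((c * dx - b * dy) / det, (a * dy - b * dx) / det)
    mid = ((m0[0] + m1[0]) / 2.0, (m0[1] + m1[1]) / 2.0)
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold


def predict(sample, w, threshold):
    """Return 1 if the projected sample exceeds the threshold, else 0."""
    return 1 if w[0] * sample[0] + w[1] * sample[1] > threshold else 0


# Hypothetical two-marker training data for two subject classes.
class0 = [(1.0, 2.0), (2.0, 1.0), (1.0, 1.0), (2.0, 2.0)]
class1 = [(4.0, 2.0), (5.0, 1.0), (4.0, 1.0), (5.0, 2.0)]
weights, cutoff = fit_lda(class0, class1)
print(weights, cutoff, predict((3.5, 1.0), weights, cutoff))
```

Here the second feature receives zero weight because the two classes do not differ along it, which is the kind of behavior feature selection methods such as ELDA or stepwise MANOVA aim to exploit.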

Eigengene-based Linear Discriminant Analysis (ELDA) is a feature selection technique developed by Shen et al. (2006). The formula selects features (e.g., biomarkers) in a multivariate framework using a modified eigen analysis to identify features associated with the most important eigenvectors. “Important” is defined as those eigenvectors that explain the most variance in the differences among the samples being classified, relative to some threshold.

A support vector machine (SVM) is a classification formula that attempts to find a hyperplane that separates two classes. This hyperplane contains support vectors, data points that are exactly the margin distance away from the hyperplane. In the likely event that no separating hyperplane exists in the current dimensions of the data, the dimensionality is expanded greatly by projecting the data into larger dimensions by taking non-linear functions of the original variables (Venables and Ripley, 2002). Although not required, filtering of features for SVM often improves prediction. Features (e.g., biomarkers) can be identified for a support vector machine using a non-parametric Kruskal-Wallis (KW) test to select the best univariate features. A random forest (RF, Breiman, 2001) or recursive partitioning (RPART, Breiman et al., 1984) can also be used separately or in combination to identify biomarker combinations that are most important. Both KW and RF require that a number of features be selected from the total. RPART creates a single classification tree using a subset of available biomarkers.
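The univariate Kruskal-Wallis filter mentioned above can be sketched as follows. The sketch assumes the standard H statistic computed from pooled ranks, with tied values given their average rank and without the optional tie correction; the number of features to retain, k, must be supplied by the user, as noted in the text. The marker names and values are hypothetical.

```python
def average_ranks(values):
    """1-based ranks of values, with ties assigned their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # average of 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of groups."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum * rank_sum / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


def top_features(feature_groups, k):
    """Return the k feature names with the largest H statistic."""
    ordered = sorted(
        feature_groups,
        key=lambda name: kruskal_wallis_h(feature_groups[name]),
        reverse=True,
    )
    return ordered[:k]


# Hypothetical measurements of two markers in two subject classes.
features = {
    "marker_a": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    "marker_b": [[1.0, 4.0, 5.0], [2.0, 3.0, 6.0]],
}
print(top_features(features, 1))
```

The retained features could then be passed to an SVM or other classifier; the filter itself makes no use of interactions between features, which is one reason RF or RPART may be used in addition.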

Other formulae may be used in order to pre-process the results of individual DETERMINANT measurement into more valuable forms of information, prior to their presentation to the predictive formula. Most notably, normalization of biomarker results, using either common mathematical transformations such as logarithmic or logistic functions, as normal or other distribution positions, in reference to a population's mean values, etc., are all well known to those skilled in the art. Of particular interest are a set of normalizations based on Clinical Parameters such as age, gender, race, or sex, where specific formulae are used solely on subjects within a class or continuously combining a Clinical Parameter as an input. In other cases, analyte-based biomarkers can be combined into calculated variables which are subsequently presented to a formula.
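One such pre-processing step, a logarithmic transformation followed by expression of the result as a position relative to a reference population, can be sketched as follows. The sketch assumes a natural-log transform and a z-score against the reference population's log-scale mean and sample standard deviation; the function name and the reference values are hypothetical.

```python
from math import exp, log
from statistics import mean, stdev


def log_z_score(value, reference_values):
    """Log-transform a measurement and express it in reference SD units."""
    logs = [log(v) for v in reference_values]
    return (log(value) - mean(logs)) / stdev(logs)


# Hypothetical reference population measurements (arbitrary units).
reference = [1.0, exp(1.0), exp(2.0)]
print(log_z_score(exp(3.0), reference))
```

A class-specific normalization of the kind described above would apply the same function with a reference population restricted to subjects sharing the relevant Clinical Parameter.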

In addition to the individual parameter values of one subject potentially being normalized, an overall predictive formula for all subjects, or any known class of subjects, may itself be recalibrated or otherwise adjusted based on adjustment for a population's expected prevalence and mean biomarker parameter values, according to the technique outlined in D'Agostino et al., JAMA (2001), or other similar normalization and recalibration techniques. Such epidemiological adjustment statistics may be captured, confirmed, improved and updated continuously through a registry of past data presented to the model, which may be machine readable or otherwise, or occasionally through the retrospective query of stored samples or reference to historical studies of such parameters and statistics. Additional examples that may be the subject of formula recalibration or other adjustments include statistics used in studies by Pepe, M. S. et al., 2004 on the limitations of odds ratios; Cook, N. R., 2007 relating to ROC curves. Finally, the numeric result of a classifier formula itself may be transformed post-processing by its reference to an actual clinical population and study results and observed endpoints, in order to calibrate to absolute risk and provide confidence intervals for varying numeric results of the classifier or risk formula. An example of this is the presentation of absolute risk, and confidence intervals for that risk, derived using an actual clinical study, chosen with reference to the output of the recurrence score formula in the Oncotype Dx product of Genomic Health, Inc. (Redwood City, Calif.). A further modification is to adjust for smaller sub-populations of the study based on the output of the classifier or risk formula and defined and selected by their Clinical Parameters, such as age or sex.
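As one simple illustration of adjusting a formula's output for a population's expected prevalence, the predicted odds can be rescaled by the ratio of the target-population prior odds to the training-population prior odds. This is a generic prior-shift correction offered only as a sketch; it is not asserted to be the specific recalibration technique of D'Agostino et al., and the function name and numeric values are hypothetical.

```python
def recalibrate_for_prevalence(p, source_prevalence, target_prevalence):
    """Rescale a predicted probability for a population of different prevalence.

    p is the probability produced by a formula trained where the event
    prevalence was source_prevalence; the result is the corresponding
    probability where the prevalence is target_prevalence.
    """
    for value in (p, source_prevalence, target_prevalence):
        if not 0.0 < value < 1.0:
            raise ValueError("probabilities must lie strictly between 0 and 1")
    odds = p / (1.0 - p)
    source_odds = source_prevalence / (1.0 - source_prevalence)
    target_odds = target_prevalence / (1.0 - target_prevalence)
    new_odds = odds * target_odds / source_odds
    return new_odds / (1.0 + new_odds)


# Hypothetical: formula trained at 50% prevalence, applied at 10% prevalence.
print(recalibrate_for_prevalence(0.5, 0.5, 0.1))
```

The correction leaves the ranking of subjects unchanged and alters only the absolute risk reported for each, which is why it is typically applied as a post-processing step.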

Combination with Clinical Parameters and Traditional Laboratory Risk Factors

Any of the aforementioned Clinical Parameters may be used in the practice of the invention as a DETERMINANT input to a formula or as a pre-selection criterion defining a relevant population to be measured using a particular DETERMINANT panel and formula. As noted above, Clinical Parameters may also be useful in the biomarker normalization and pre-processing, or in DETERMINANT selection, panel construction, formula type selection and derivation, and formula result post-processing. A similar approach can be taken with the Traditional Laboratory Risk Factors, as either an input to a formula or as a pre-selection criterion.

Measurement of Determinants

The actual levels or amounts of the DETERMINANTS can be determined at the protein or nucleic acid level using any method known in the art. For example, at the nucleic acid level, Northern and Southern hybridization analysis, as well as ribonuclease protection assays using probes which specifically recognize one or more of these sequences can be used to determine gene expression. Alternatively, amounts of DETERMINANTS can be measured using reverse-transcription-based PCR assays (RT-PCR), e.g., using primers specific for the differentially expressed sequence of genes or by branch-chain RNA amplification and detection methods by Panomics, Inc. Amounts of DETERMINANTS can also be determined at the protein level, e.g., by measuring the levels of peptides encoded by the gene products described herein, or subcellular localization or activities thereof using technological platforms such as, for example, AQUA® (HistoRx, New Haven, Conn.) or U.S. Pat. No. 7,219,016. Such methods are well known in the art and include, e.g., immunoassays based on antibodies to proteins encoded by the genes, aptamers or molecular imprints. Any biological material can be used for the detection/quantification of the protein or its activity. Alternatively, a suitable method can be selected to determine the activity of proteins encoded by the marker genes according to the activity of each protein analyzed.

The DETERMINANT proteins, polypeptides, mutations, and polymorphisms thereof can be detected in any suitable manner, but are typically detected by contacting a sample from the subject with an antibody which binds the DETERMINANT protein, polypeptide, mutation, or polymorphism and then detecting the presence or absence of a reaction product. The antibody may be monoclonal, polyclonal, chimeric, or a fragment of the foregoing, as discussed in detail above, and the step of detecting the reaction product may be carried out with any suitable immunoassay. The sample from the subject is typically a biological fluid as described above, and may be the same sample of biological fluid used to conduct the method described above.

Immunoassays carried out in accordance with the present invention may be homogeneous assays or heterogeneous assays. In a homogeneous assay the immunological reaction usually involves the specific antibody (e.g., anti-DETERMINANT protein antibody), a labeled analyte, and the sample of interest. The signal arising from the label is modified, directly or indirectly, upon the binding of the antibody to the labeled analyte. Both the immunological reaction and detection of the extent thereof can be carried out in a homogeneous solution. Immunochemical labels which may be employed include free radicals, radioisotopes, fluorescent dyes, enzymes, bacteriophages, or coenzymes.

In a heterogeneous assay approach, the reagents are usually the sample, the antibody, and means for producing a detectable signal. Samples as described above may be used. The antibody can be immobilized on a support, such as a bead (such as protein A and protein G agarose beads), plate or slide, and contacted with the specimen suspected of containing the antigen in a liquid phase. The support is then separated from the liquid phase and either the support phase or the liquid phase is examined for a detectable signal employing means for producing such signal. The signal is related to the presence of the analyte in the sample. Means for producing a detectable signal include the use of radioactive labels, fluorescent labels, or enzyme labels. For example, if the antigen to be detected contains a second binding site, an antibody which binds to that site can be conjugated to a detectable group and added to the liquid phase reaction solution before the separation step. The presence of the detectable group on the solid support indicates the presence of the antigen in the test sample. Examples of suitable immunoassays are oligonucleotides, immunoblotting, immunofluorescence methods, immunoprecipitation, chemiluminescence methods, electrochemiluminescence (ECL) or enzyme-linked immunoassays.

Those skilled in the art will be familiar with numerous specific immunoassay formats and variations thereof which may be useful for carrying out the method disclosed herein. See generally E. Maggio, Enzyme-Immunoassay, (1980) (CRC Press, Inc., Boca Raton, Fla.); see also U.S. Pat. No. 4,727,022 to Skold et al. titled “Methods for Modulating Ligand-Receptor Interactions and their Application,” U.S. Pat. No. 4,659,678 to Forrest et al. titled “Immunoassay of Antigens,” U.S. Pat. No. 4,376,110 to David et al., titled “Immunometric Assays Using Monoclonal Antibodies,” U.S. Pat. No. 4,275,149 to Litman et al., titled “Macromolecular Environment Control in Specific Receptor Assays,” U.S. Pat. No. 4,233,402 to Maggio et al., titled “Reagents and Method Employing Channeling,” and U.S. Pat. No. 4,230,767 to Boguslaski et al., titled “Heterogenous Specific Binding Assay Employing a Coenzyme as Label.”

Antibodies can be conjugated to a solid support suitable for a diagnostic assay (e.g., beads such as protein A or protein G agarose, microspheres, plates, slides or wells formed from materials such as latex or polystyrene) in accordance with known techniques, such as passive binding. Antibodies as described herein may likewise be conjugated to detectable labels or groups such as radiolabels (e.g., ³⁵S, ¹²⁵I, ¹³¹I), enzyme labels (e.g., horseradish peroxidase, alkaline phosphatase), and fluorescent labels (e.g., fluorescein, Alexa, green fluorescent protein, rhodamine) in accordance with known techniques.

Antibodies can also be useful for detecting post-translational modifications of DETERMINANT proteins, polypeptides, mutations, and polymorphisms, such as tyrosine phosphorylation, threonine phosphorylation, serine phosphorylation, and glycosylation (e.g., O-GlcNAc). Such antibodies specifically detect the phosphorylated amino acids in a protein or proteins of interest, and can be used in the immunoblotting, immunofluorescence, and ELISA assays described herein. These antibodies are well-known to those skilled in the art, and commercially available. Post-translational modifications can also be determined using metastable ions in reflector matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF) (Wirth et al. Proteomics (2002)).

For DETERMINANT proteins, polypeptides, mutations, and polymorphisms known to have enzymatic activity, the activities can be determined in vitro using enzyme assays known in the art. Such assays include, without limitation, kinase assays, phosphatase assays, and reductase assays, among many others. Modulation of the kinetics of enzyme activities can be determined by measuring the Michaelis constant K_M using known algorithms, such as the Hill plot, Michaelis-Menten equation, linear regression plots such as Lineweaver-Burk analysis, and Scatchard plot.
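The Lineweaver-Burk analysis mentioned above can be sketched as an ordinary least-squares fit of 1/v against 1/[S], which under the Michaelis-Menten equation v = V_max[S]/(K_M + [S]) is a straight line with slope K_M/V_max and intercept 1/V_max. The substrate concentrations and kinetic parameters below are hypothetical and noise-free, chosen only to show that the fit recovers the values used to generate the data.

```python
def lineweaver_burk(substrate, velocity):
    """Estimate (K_M, V_max) by linear regression of 1/v on 1/[S]."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in velocity]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    v_max = 1.0 / intercept  # intercept = 1 / V_max
    k_m = slope * v_max      # slope = K_M / V_max
    return k_m, v_max


# Hypothetical noise-free data generated with K_M = 2 and V_max = 10.
concentrations = [0.5, 1.0, 2.0, 4.0, 8.0]
rates = [10.0 * s / (2.0 + s) for s in concentrations]
k_m, v_max = lineweaver_burk(concentrations, rates)
print(k_m, v_max)
```

With real measurements the double-reciprocal transformation amplifies error at low substrate concentrations, so a direct non-linear fit of the Michaelis-Menten equation may be preferred where accuracy matters.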

Using sequence information provided by the database entries for the DETERMINANT sequences, expression of the DETERMINANT sequences can be detected (if present) and measured using techniques well known to one of ordinary skill in the art. For example, sequences within the sequence database entries corresponding to DETERMINANT sequences, or within the sequences disclosed herein, can be used to construct probes for detecting DETERMINANT RNA sequences in, e.g., Northern blot hybridization analyses or methods which specifically, and, preferably, quantitatively amplify specific nucleic acid sequences. As another example, the sequences can be used to construct primers for specifically amplifying the DETERMINANT sequences in, e.g., amplification-based detection methods such as reverse-transcription based polymerase chain reaction (RT-PCR). When alterations in gene expression are associated with gene amplification, deletion, polymorphisms, and mutations, sequence comparisons in test and reference populations can be made by comparing relative amounts of the examined DNA sequences in the test and reference cell populations.

Expression of the genes disclosed herein can be measured at the RNA level using any method known in the art. For example, Northern hybridization analysis using probes which specifically recognize one or more of these sequences can be used to determine gene expression. Alternatively, expression can be measured using reverse-transcription-based PCR assays (RT-PCR), e.g., using primers specific for the differentially expressed sequences. RNA can also be quantified using, for example, other target amplification methods (e.g., TMA, SDA, NASBA), or signal amplification methods (e.g., bDNA), and the like.

Alternatively, DETERMINANT protein and nucleic acid metabolites can be measured. The term “metabolite” includes any chemical or biochemical product of a metabolic process, such as any compound produced by the processing, cleavage or consumption of a biological molecule (e.g., a protein, nucleic acid, carbohydrate, or lipid). Metabolites can be detected in a variety of ways known to one of skill in the art, including refractive index spectroscopy (RI), ultra-violet spectroscopy (UV), fluorescence analysis, radiochemical analysis, near-infrared spectroscopy (near-IR), nuclear magnetic resonance spectroscopy (NMR), light scattering analysis (LS), mass spectrometry, pyrolysis mass spectrometry, nephelometry, dispersive Raman spectroscopy, gas chromatography combined with mass spectrometry, liquid chromatography combined with mass spectrometry, matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) combined with mass spectrometry, ion spray spectroscopy combined with mass spectrometry, capillary electrophoresis, NMR and IR detection. (See WO 04/056456 and WO 04/088309, each of which is hereby incorporated by reference in its entirety.) In this regard, other DETERMINANT analytes can be measured using the above-mentioned detection methods, or other methods known to the skilled artisan. For example, circulating calcium ions (Ca²⁺) can be detected in a sample using fluorescent dyes such as the Fluo series, Fura-2A, Rhod-2, among others. Other DETERMINANT metabolites can be similarly detected using reagents that are specifically designed or tailored to detect such metabolites.

Kits

The invention also includes a DETERMINANT-detection reagent, e.g., nucleic acids that specifically identify one or more DETERMINANT nucleic acids by having homologous nucleic acid sequences, such as oligonucleotide sequences, complementary to a portion of the DETERMINANT nucleic acids, or antibodies to proteins encoded by the DETERMINANT nucleic acids, packaged together in the form of a kit. The oligonucleotides can be fragments of the DETERMINANT genes. For example, the oligonucleotides can be 200, 150, 100, 50, 25, 10 or fewer nucleotides in length. The kit may contain in separate containers a nucleic acid or antibody (either already bound to a solid matrix or packaged separately with reagents for binding them to the matrix), control formulations (positive and/or negative), and/or a detectable label such as fluorescein, green fluorescent protein, rhodamine, cyanine dyes, Alexa dyes, luciferase, radiolabels, among others. Instructions (e.g., written, tape, VCR, CD-ROM, etc.) for carrying out the assay may be included in the kit. The assay may for example be in the form of a Northern hybridization or a sandwich ELISA as known in the art.

For example, DETERMINANT detection reagents can be immobilized on a solid matrix such as a porous strip to form at least one DETERMINANT detection site. The measurement or detection region of the porous strip may include a plurality of sites containing a nucleic acid. A test strip may also contain sites for negative and/or positive controls. Alternatively, control sites can be located on a separate strip from the test strip. Optionally, the different detection sites may contain different amounts of immobilized nucleic acids, e.g., a higher amount in the first detection site and lesser amounts in subsequent sites. Upon the addition of test sample, the number of sites displaying a detectable signal provides a quantitative indication of the amount of DETERMINANTS present in the sample. The detection sites may be configured in any suitably detectable shape and are typically in the shape of a bar or dot spanning the width of a test strip.

Alternatively, the kit contains a nucleic acid substrate array comprising one or more nucleic acid sequences. The nucleic acids on the array specifically identify one or more nucleic acid sequences represented by DETERMINANTS listed on Table 2, 3, or 5. In various embodiments, the expression of 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 40, 50, 100, 125, 150, 175, 200, 220 or more of the sequences represented by DETERMINANTS listed on Table 2, 3, or 5 can be identified by virtue of binding to the array. The substrate array can be on, e.g., a solid substrate, e.g., a “chip” as described in U.S. Pat. No. 5,744,305. Alternatively, the substrate array can be a solution array, e.g., xMAP (Luminex, Austin, Tex.), Cyvera (Illumina, San Diego, Calif.), CellCard (Vitra Bioscience, Mountain View, Calif.) and Quantum Dots' Mosaic (Invitrogen, Carlsbad, Calif.).

Suitable sources for antibodies for the detection of DETERMINANTS include commercially available sources such as, for example, Abazyme, Abnova, Affinity Biologicals, AntibodyShop, Biogenesis, Biosense Laboratories, Calbiochem, Cell Sciences, Chemicon International, Chemokine, Clontech, Cytolab, DAKO, Diagnostic BioSystems, eBioscience, Endocrine Technologies, Enzo Biochem, Eurogentec, Fusion Antibodies, Genesis Biotech, GloboZymes, Haematologic Technologies, Immunodetect, Immunodiagnostik, Immunometrics, Immunostar, Immunovision, Biogenex, Invitrogen, Jackson ImmunoResearch Laboratory, KMI Diagnostics, Koma Biotech, LabFrontier Life Science Institute, Lee Laboratories, Lifescreen, Maine Biotechnology Services, Mediclone, MicroPharm Ltd., ModiQuest, Molecular Innovations, Molecular Probes, Neoclone, Neuromics, New England Biolabs, Novocastra, Novus Biologicals, Oncogene Research Products, Orbigen, Oxford Biotechnology, Panvera, PerkinElmer Life Sciences, Pharmingen, Phoenix Pharmaceuticals, Pierce Chemical Company, Polymun Scientific, Polysciences, Inc., Promega Corporation, Proteogenix, Protos Immunoresearch, QED Biosciences, Inc., R&D Systems, Repligen, Research Diagnostics, Roboscreen, Santa Cruz Biotechnology, Seikagaku America, Serological Corporation, Serotec, Sigma-Aldrich, StemCell Technologies, Synaptic Systems GmbH, Technopharm, Terra Nova Biotechnology, TiterMax, Trillium Diagnostics, Upstate Biotechnology, US Biological, Vector Laboratories, Wako Pure Chemical Industries, and Zeptometrix. However, the skilled artisan can routinely make antibodies, nucleic acid probes, e.g., oligonucleotides, aptamers, siRNAs, antisense oligonucleotides, against any of the DETERMINANTS in Table 2 or Table 3.

Methods of Treating or Preventing Cancer

The invention provides a method for treating, preventing or alleviating a symptom of cancer in a subject by decreasing expression or activity of DETERMINANTS 1-300 or increasing expression or activity of DETERMINANTS 301-741. Therapeutic compounds are administered prophylactically or therapeutically to a subject suffering from or at risk of (or susceptible to) developing cancer. Such subjects are identified using standard clinical methods or by detecting an aberrant level of expression or activity of DETERMINANTS (e.g., DETERMINANTS 1-741). Therapeutic agents include inhibitors of cell cycle regulation, cell proliferation, and protein kinase activity.

The therapeutic method includes increasing the expression, or function, or both of one or more gene products of genes whose expression is decreased (“underexpressed genes”) in a cancer cell relative to normal cells of the same tissue type from which the cancer cells are derived. In these methods, the subject is treated with an effective amount of a compound which increases the amount of one or more of the underexpressed genes in the subject. Administration can be systemic or local. Therapeutic compounds include a polypeptide product of an underexpressed gene, or a biologically active fragment thereof; a nucleic acid encoding an underexpressed gene and having expression control elements permitting expression in the cancer cells; or, for example, an agent which increases the level of expression of such gene endogenous to the cancer cells (i.e., which up-regulates expression of the underexpressed gene or genes). Administration of such compounds counters the effects of the aberrantly underexpressed gene or genes in the subject's cells and improves the clinical condition of the subject.

The method also includes decreasing the expression, or function, or both, of one or more gene products of genes whose expression is aberrantly increased (“overexpressed gene”) in cancer cells relative to normal cells. Expression is inhibited in any of several ways known in the art. For example, expression is inhibited by administering to the subject a nucleic acid that inhibits, or antagonizes, the expression of the overexpressed gene or genes, e.g., an antisense oligonucleotide which disrupts expression of the overexpressed gene or genes.

Alternatively, function of one or more gene products of the overexpressed genes is inhibited by administering a compound that binds to or otherwise inhibits the function of the gene products. For example, the compound is an antibody which binds to the overexpressed gene product or gene products.

These modulatory methods are performed ex vivo or in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). The method involves administering a protein or combination of proteins or a nucleic acid molecule or combination of nucleic acid molecules as therapy to counteract aberrant expression or activity of the differentially expressed genes.

Diseases and disorders that are characterized by increased (relative to a subject not suffering from the disease or disorder) levels or biological activity of the genes may be treated with therapeutics that antagonize (i.e., reduce or inhibit) activity of the overexpressed gene or genes. Therapeutics that antagonize activity are administered therapeutically or prophylactically (e.g., vaccines).

Therapeutics that may be utilized include, e.g., (i) a polypeptide, or analogs, derivatives, fragments or homologs thereof, of the overexpressed or underexpressed sequence or sequences; (ii) antibodies to the overexpressed or underexpressed sequence or sequences; (iii) nucleic acids encoding the overexpressed or underexpressed sequence or sequences; (iv) antisense nucleic acids or nucleic acids that are “dysfunctional” (i.e., due to a heterologous insertion within the coding sequences of one or more overexpressed or underexpressed sequences); (v) small molecules; (vi) siRNA; (vii) aptamers; or (viii) modulators (i.e., inhibitors, agonists and antagonists) that alter the interaction between an over/underexpressed polypeptide and its binding partner. The dysfunctional antisense molecules are utilized to “knock out” endogenous function of a polypeptide by homologous recombination (see, e.g., Capecchi, Science (1989)).

Diseases and disorders that are characterized by decreased (relative to a subject not suffering from the disease or disorder) levels or biological activity may be treated with therapeutics that increase (i.e., are agonists to) activity. Therapeutics that upregulate activity may be administered in a therapeutic or prophylactic manner. Therapeutics that may be utilized include, but are not limited to, a polypeptide (or analogs, derivatives, fragments or homologs thereof) or an agonist that increases bioavailability.

Generation of Transgenic Animals

Transgenic animals of the invention have one or two null mTert alleles, harboring a Lox-Stop-Lox (LSL) cassette in the first intron of the mTert gene. Upon Cre-mediated excision of the LSL cassette, mTert expression is restored. Transgenic animals of the invention have one or two null endogenous alleles of the Pten and p53 genes. Transgenic animals of the invention have one or two null mTert alleles and one or two null endogenous alleles of the Pten and p53 genes. Inactivation can be achieved by modification of the endogenous gene, usually a deletion, substitution or addition to a coding or noncoding region of the gene. The modification can prevent synthesis of a gene product or can result in a gene product lacking functional activity. Typical modifications are the introduction of an exogenous segment, such as a selection marker, within an exon, thereby disrupting the exon, or the deletion of an exon.

Inactivation of endogenous genes in mice can be achieved by homologous recombination between an endogenous gene in a mouse embryonic stem (ES) cell and a targeting construct. Typically, the targeting construct contains a positive selection marker flanked by segments of the gene to be targeted. Usually the segments are from the same species as the gene to be targeted (e.g., mouse). However, the segments can be obtained from another species, such as human, provided they have sufficient sequence identity with the gene to be targeted to undergo homologous recombination with it. Typically, the construct also contains a negative selection marker positioned outside one or both of the segments designed to undergo homologous recombination with the endogenous gene (see U.S. Pat. No. 6,204,061). Optionally, the construct also contains a pair of site-specific recombination sites, such as frt, positioned within or at the ends of the segments designed to undergo homologous recombination with the endogenous gene. The construct is introduced into ES cells, usually by electroporation, and undergoes homologous recombination with the endogenous gene, introducing the positive selection marker and parts of the flanking segments (and frt sites, if present) into the endogenous gene. ES cells having undergone the desired recombination can be selected by positive and negative selection. Positive selection selects for cells that have undergone the desired homologous recombination, and negative selection selects against cells that have undergone negative recombination. ES cells are obtained from preimplantation embryos cultured in vitro (Bradley et al., Nature (1984)) (incorporated by reference in its entirety for all purposes). Transformed ES cells are combined with blastocysts from a non-human animal. The ES cells colonize the embryo and in some embryos form or contribute to the germline of the resulting chimeric animal. See Jaenisch, Science (1988) (incorporated by reference in its entirety for all purposes). Chimeric animals can be bred with nontransgenic animals to generate heterozygous transgenic animals. Heterozygous animals can be bred with each other to generate homozygous animals. Either heterozygous or homozygous animals can be bred with a transgenic animal expressing the flp recombinase. Expression of the recombinase results in excision of the portion of DNA between introduced frt sites, if present.

Functional inactivation can also be achieved in other species, such as rats, rabbits and other rodents, ovines such as sheep, caprines such as goats, porcines such as pigs, and bovines such as cattle and buffalo. For animals other than mice, nuclear transfer technology is preferred for generating functionally inactivated genes. See Lai et al., Science (2002). Various types of cells can be employed as donors for nuclei to be transferred into oocytes, including ES cells and fetal fibrocytes. Donor nuclei are obtained from cells cultured in vitro into which a construct has been introduced and undergone homologous recombination with an endogenous gene, as described above (see WO 98/37183 and WO 98/39416, each incorporated by reference in its entirety for all purposes). Donor nuclei are introduced into oocytes by means of fusion, induced electrically or chemically (see any one of WO 97/07669, WO 98/30683 and WO 98/39416), or by microinjection (see WO 99/37143, incorporated by reference in its entirety for all purposes). Transplanted oocytes are subsequently cultured to develop into embryos, which are subsequently implanted in the oviducts of pseudopregnant female animals, resulting in birth of transgenic offspring (see any one of WO 97/07669, WO 98/30683 and WO 98/39416). Transgenic animals bearing heterozygous transgenes can be bred with each other to generate transgenic animals bearing homozygous transgenes.

The Cre/loxP system (conditional gene inactivation system) is a tool for tissue-specific (and, in connection with the tet system, also time-specific) inactivation of genes, for example, but not limited to, genes that cannot be investigated in differentiated tissues because of their early embryonic lethality in mice with conventional knockouts. It can also be used for the removal of a transgene (which was overexpressed in a specific tissue) at a certain time point to study the inverse effect of a downregulation of the transgene in a time course experiment. In general, two mouse lines are required for conditional gene inactivation: first, a conventional transgenic mouse line with Cre targeted to a specific tissue or cell type, and second, a mouse strain that embodies a target gene (endogenous gene or transgene) flanked by two loxP sites in a direct orientation (“floxed gene”). Recombination (excision and consequently inactivation of the target gene) occurs only in those cells expressing Cre recombinase. Hence, the target gene remains active in all cells and tissues which do not express Cre.

Some transgenic animals of the invention have both an inactivation of one or both alleles of Pten and p53 genes and/or one or two null mTert alleles that confer an additional phenotype related to prostate cancer, its pathology or underlying biochemical processes. This disruption can be achieved by recombinase-mediated excision of Pten, p53 or mTert genes with embedded LoxP sites or by, for example, LSL cassette knock-in, and by RNAi-mediated extinction of these genes either in a germline configuration or in somatic transduction of prostate epithelium in situ or in cell culture followed by reintroduction of these primary cells into the renal capsule or orthotopically. Other engineering strategies are also obvious, including chimera formation using targeted ES clones that avoid germline transmission.

EXAMPLES

Example 1: General Methods

mTert Knockout Allele, Pten and Trp53 Conditional Alleles.

The mTert knockout allele and the Pten^(loxP) conditional knockout alleles have been described elsewhere (Zheng et al., Nature (2008); Farazi et al., Cancer Res. (2006)). The p53^(loxP) strain was generously provided by A. Berns (Marino et al., Genes Dev. (2000)). Prostate epithelium-specific deletion was effected by the PB-Cre4 transgene (Wu et al., Mech. Dev. (2001)), which was obtained from MMHCC (http://mouse.ncifcrf.gov/search_results.asp).

Generation of the LSL-mTERT^(loxP) Allele.

We knocked the LSL cassette into the first intron of mTert (FIG. 1A). The presence of the LSL cassette produces a null mTert allele, and its removal by PB-Cre expression restores activity under the control of the native mTert promoter. This mouse model scheme allows tissue-specific control of telomerase reconstitution in prostate epithelial cells. Following introduction of the construct into ES cells, screening of ES cells, germline transmission and NeoR cassette deletion via EIIa-Cre, the LSL-mTert allele was backcrossed 4 generations onto the C57Bl/6 background.

Mating Scheme.

As depicted in FIG. 6, the LSL-mTERT^(loxP) mice were crossed with G0 mTert^(−/+) p53^(L/L) Pten^(L/L) PB-Cre4 mice to generate G0 mTert^(+/+) LSL-mTert^(+/+) p53^(L/L) Pten^(L/L) PB-Cre4, G0 mTert^(+/+) LSL-mTert^(L/+) p53^(L/L) Pten^(L/L) PB-Cre4, and G0 mTert^(+/−) LSL-mTert^(+/+) p53^(L/L) Pten^(L/L) PB-Cre4 mice. These mice were then intercrossed to generate G0 mTert^(+/+) LSL-mTert^(+/+) p53^(L/L) Pten^(L/L) PB-Cre4 and G1 mTert^(−/−) LSL-mTert^(+/+) p53^(L/L) Pten^(L/L) PB-Cre4; G1 mTert^(+/−) LSL-mTert^(L/+) p53^(L/L) PB-Cre4, and G1 mTert^(+/+) LSL-mTert^(L/L) p53^(L/L) PB-Cre4. G1 mice were then intercrossed to generate G2, G3, and G4 mice.

Tissue Analysis.

Normal and tumor tissues were fixed in 10% neutral-buffered formalin overnight, then processed, paraffin-embedded, sectioned and stained with hematoxylin and eosin according to standard protocol. For immunohistochemistry, 5 micron sections were incubated with primary antibodies overnight at 4° C. in a humidified chamber. For rabbit antibodies, sections were subsequently developed using Dako Envision. Mouse monoclonal staining was developed using MOM kit (Vector). Representative sections from at least three mice were counted for each genotype.

For Western blot analysis, tissues and cells were lysed in RIPA buffer (20 mM Tris pH 7.5, 150 mM NaCl, 1% Nonidet P-40, 0.5% sodium deoxycholate, 1 mM EDTA, 0.1% SDS) containing complete mini protease inhibitors (Roche) and phosphatase inhibitors. Western blots were obtained utilizing 20-50 μg of lysate protein, and were incubated with antibodies against HSP70 (610607, BD Transduction Laboratories).

Laser Capture Microdissection and DNA Extraction.

Laser capture microdissection was done as previously described (Emmert-Buck et al., 1996). Genomic DNA of microdissected prostate tumor cells was extracted with phenol-chloroform prior to PCR analysis.

TUNEL Assay.

To determine apoptosis in prostate tumor cells, TUNEL staining was performed using the ApopTag Plus peroxidase kit (Chemicon) according to the manufacturer's protocol. To quantify apoptosis in tumor cells, we selected 3 to 5 high-power fields per mouse, and apoptotic cells were counted by two independent investigators. The percentage of apoptotic cells from each group of mice was compared.

Cytogenetics, Quantitative Telomere FISH and Spectral Karyotyping Analysis.

We prepared metaphase chromosomes from prostate tumor cells or early passage. We subjected metaphases to Giemsa staining or quantitative FISH analysis of telomeric sequences with Cy-3-labeled T2AG3 peptide-nucleic acid (PNA) probe. We carried out spectral karyotyping analysis according to the manufacturer's recommendations, using mouse chromosome paint probes (Applied Spectral Imaging) on a Nikon Eclipse 800 microscope equipped with an ASI interferometer and workstation. Depending on the quality of metaphase spreads, 10-20 metaphases from each sample were analyzed in detail.

Establishment of Mouse Prostate Tumor Cell Lines.

Tumors were dissected from prostates of G0 Pten^(loxp/loxp) Trp53^(loxp/loxp) PB-Cre4⁺, G3 and G4 mice, minced, and digested with 0.5% type I collagenase (Invitrogen) as described previously. After filtering through a 40-μm mesh, the trapped fragments were plated in tissue culture dishes coated with type I collagen (BD Pharmingen). Cell lines were established and maintained in DMEM plus 10% fetal bovine serum (FBS, Omega Scientific), 25 μg/mL bovine pituitary extract, 5 μg/mL bovine insulin, and 6 ng/mL recombinant human epidermal growth factor (Sigma-Aldrich).

RNA Isolation and Real-Time PCR.

Total RNA was extracted using the RNeasy Mini kit (Qiagen) and treated with RQ1 RNase-free DNase Set (Promega). First-strand cDNA was synthesized using 1 μg of total RNA and SuperScript III (Invitrogen). Real-time qPCR was performed in triplicate with a MxPro3000 and SYBR GreenER qPCR mix (Invitrogen). The relative amount of specific mRNA was normalized to GAPDH. Primer sequences are available upon request.

Array-CGH Analysis for Minimal Common Regions (MCRs) of Chromosomal Amplifications or Deletions of Prostate Tumors in LSL-mTert Mice.

The array-CGH data of 18 later generation (G3 or G4) mTert^(+/−) LSL-mTert^(L/+) p53^(L/L) Pten^(L/L) PB-Cre4 and mTert^(+/+) LSL-mTert^(L/L) p53^(L/L) Pten^(L/L) PB-Cre4 mice were analyzed with the MCR algorithm²⁵ to detect focal genomic regions with copy number alteration (CNA) events in at least two mice. Mouse genome data build mm9 was used in the analysis. A total of 2183 genes from 57 amplified regions (Table 2) and 3531 genes from 38 deletion regions were detected by the MCR algorithm.

Array-CGH Analysis for Recurrent Focal and Arm-Level Chromosomal Copy Number Alterations in Human Prostate Tumors with the GISTIC2 Algorithm.

The array-CGH data of 194 human prostate tumors² were analyzed with the GISTIC2 algorithm⁴⁷ to detect focal genomic regions with copy number alteration (CNA) events. Focal regions with q-values smaller than 0.25 were considered significant, which resulted in 16 amplified and 39 deleted regions. Arm-level changes with q-values smaller than 0.005 were considered significant, which suggested chromosome 7p, 7q, and 8q amplification and 6q, 8p, 12p, 13q, 16q, 17p, and 18q deletion.

Homolog Mapping for CNA Synteny Regions Across Human and Mouse Tumors.

We used the NCBI HomoloGene database (version 39.2) to map human and mouse homolog genes and detect syntenic CNA regions. The HomoloGene analysis characterized 300 amplified genes and 441 deleted genes that commonly recurred in human and mouse prostate tumors.
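The cross-species mapping step can be illustrated with a minimal Python sketch. It assumes the homolog table has already been reduced to a simple mouse-to-human symbol dictionary; the function name `conserved_genes` and the toy gene sets are hypothetical and are not taken from the analysis described above.

```python
def conserved_genes(mouse_genes, human_genes, mouse_to_human):
    """Return human symbols altered in both species.

    mouse_genes     -- set of mouse gene symbols inside murine CNA regions
    human_genes     -- set of human gene symbols inside human CNA regions
    mouse_to_human  -- dict mapping mouse symbol -> human homolog symbol
                       (assumed one-to-one here for simplicity)
    """
    # Translate every mouse gene that has a known human homolog.
    mapped = {mouse_to_human[g] for g in mouse_genes if g in mouse_to_human}
    # Keep only the homologs that are also altered in the human data.
    return mapped & set(human_genes)


if __name__ == "__main__":
    # Toy inputs for illustration only.
    homologs = {"Myc": "MYC", "Smad4": "SMAD4"}
    mouse_amplified = {"Myc", "Smad4", "Gm0000"}
    human_amplified = {"MYC", "PTEN"}
    print(conserved_genes(mouse_amplified, human_amplified, homologs))
```

In such a sketch the function would be called once for amplified genes and once for deleted genes, so that the direction of the alteration is matched across species.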

Clinical Outcome Analysis.

The raw Affymetrix HG-U133A expression profiles and clinical information of 79 prostate cancer patients from the Glinsky et al. cohort (Table 2)³ were generously provided by Dr. William Gerald. The raw dataset was analyzed by the MAS5 algorithm. Low-expression probesets with less than 20% present calls across the 79 samples were excluded from the data. The remaining 13,027 probesets map to 8,763 genes with unique symbols, and the mean log-transformed probeset levels were used as the gene expression profiles.
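The probeset filtering and gene-level averaging described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline actually used: the input layout (dictionaries keyed by probeset), the `"P"` label for a present call, and the function name `filter_and_collapse` are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def filter_and_collapse(log_expr, present_calls, probe_to_gene,
                        min_present_fraction=0.20):
    """Drop low-expression probesets, then average probesets per gene.

    log_expr       -- dict: probeset -> list of log-transformed levels per sample
    present_calls  -- dict: probeset -> list of detection calls per sample,
                      where "P" is assumed to mean "present"
    probe_to_gene  -- dict: probeset -> gene symbol
    """
    per_gene = defaultdict(list)
    for probe, values in log_expr.items():
        calls = present_calls[probe]
        fraction = sum(1 for c in calls if c == "P") / len(calls)
        if fraction < min_present_fraction:
            continue  # fewer than 20% present calls: excluded
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue  # probeset without a gene symbol
        per_gene[gene].append(values)
    # Average the surviving probesets of each gene, sample by sample.
    return {gene: [mean(column) for column in zip(*rows)]
            for gene, rows in per_gene.items()}
```

A probeset with exactly 20% present calls is kept in this sketch, following the "less than 20%" wording of the exclusion rule.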

A univariate Cox proportional hazard analysis was conducted using the R survival package for invasion assay positive genes to identify those whose expression in PCA tumors was positively associated with biochemical recurrence (BCR, defined by post-op PSA>0.2 ng/ml) in the Glinsky et al. dataset³.

Kaplan-Meier analysis for the survival difference of the two cancer patient clusters was conducted using the R survival package. C-statistics analysis was conducted using the R survcomp package. The statistical procedures used in the analyses include a bootstrapping step that estimates the distribution of C-statistics of all models across 10,000 random bootstrapping instances, and a comparative step that uses the t-test to compare the C-statistics of models and evaluate the statistical significance. Multivariate Cox proportional hazards model analysis with the 4-gene signature was used to estimate the coefficients of individual genes, which combined the 4-gene expression levels into an integrated risk score model.
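To make the two quantities in this paragraph concrete, the following Python sketch shows a linear risk score and a pairwise concordance statistic (Harrell's C). It is a simplified illustration: the coefficient values passed to `risk_score` are placeholders supplied by the caller, not the fitted Cox coefficients, and the tie handling may differ from that of the R survcomp package used in the analysis.

```python
def risk_score(expression, coefficients):
    """Weighted sum of gene expression levels.

    expression    -- dict: gene -> expression level for one patient
    coefficients  -- dict: gene -> model coefficient (placeholders here)
    """
    return sum(coefficients[g] * expression[g] for g in coefficients)


def concordance_index(times, events, scores):
    """Harrell's C: fraction of comparable patient pairs ranked correctly.

    A pair (i, j) is comparable when patient i had an observed event
    strictly before the follow-up time of patient j. The pair is
    concordant when the earlier-failing patient has the higher score;
    tied scores count as one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A bootstrap estimate of the distribution of C would resample patients with replacement and call `concordance_index` on each resample; that loop is omitted here for brevity.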

Co-Deletion Analysis in Human Clinical Samples.

Based on the results of GISTIC2 analysis, 194 human prostate tumors (Taylor et al., 2010) were classified into 4 groups according to PTEN and p53 focal deletion status (FIG. 5E). The numbers of SMAD4 deletion events (chr18q copy number <−0.1 in log-2 scale) in each group were used to estimate the significance P value of co-deletion enrichment by Fisher's exact test in the R environment.
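For readers unfamiliar with the test, a one-sided Fisher's exact test on a 2x2 table can be written with the Python standard library as sketched below. The reduction to a single 2x2 table, the one-sided alternative, and the function name are assumptions made for illustration; the analysis above used four groups and the R implementation.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for enrichment in a 2x2 table.

    Table layout (counts of tumors):
                        SMAD4 deleted   SMAD4 not deleted
        group of interest     a                 b
        all other tumors      c                 d

    Returns the probability, under the hypergeometric null, of seeing
    at least `a` deletions in the group of interest.
    """
    row1 = a + b
    row2 = c + d
    col1 = a + c
    total = comb(row1 + row2, col1)
    k_max = min(row1, col1)
    tail = sum(comb(row1, k) * comb(row2, col1 - k)
               for k in range(a, k_max + 1))
    return tail / total
```

As a check on the arithmetic, the classic table with rows (3, 1) and (1, 3) gives (16 + 1) / 70 = 17/70 under this definition.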

Correlation Analysis Between Gene Copy Numbers and Gene Expression Changes.

The Spearman correlation coefficients between individual gene copy numbers and expression levels (both in log-2 scale) in matching samples were calculated in the R environment to estimate the significance P values.
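The Spearman coefficient is the Pearson correlation of the ranks of the two variables. The Python sketch below shows that computation with average ranks for ties; it returns only the coefficient, and the significance P value mentioned above is not computed here. The function names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(copy_number, expression):
    """Spearman correlation between per-sample copy number and expression."""
    return pearson(average_ranks(copy_number), average_ranks(expression))
```

Because only ranks enter the calculation, any monotone relationship between copy number and expression yields a coefficient of 1 or −1, regardless of its shape.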

Oncomine Consensus Analysis.

Six prostate cancer cohorts⁴⁸⁻⁵³ in the Oncomine database (www.oncomine.com) were used to filter our candidate marker gene lists. We tested the following hypotheses: whether genes in the amplified regions are related to invasive phenotypes in any of the 6 cohorts, and whether genes in the deleted regions are related to indolent phenotypes in any of the 6 cohorts.

Bone Metastasis Related Copy Number Changes.

We tested whether genes recurrently amplified or deleted in the whole prostate cancer cohort of Taylor et al. showed consistent copy number alteration patterns in tumors with documented bone metastasis (Taylor et al., 2010). For each candidate gene, we counted the number of gene gains (copy number >0.3 in log-2 scale) and losses (copy number <−0.3 in log-2 scale) in 14 bone metastasis tumors. A change is defined as consistent if an amplified CNA gene is more likely to be gained than lost, or a deleted CNA gene is more likely to be lost than gained, in bone metastatic tumors.
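The counting rule above can be expressed as a short Python function. The sketch interprets "more likely" as a strict comparison of gain and loss counts, which is an assumption; the function name and the `"amplified"`/`"deleted"` labels are likewise illustrative.

```python
def is_consistent(copy_numbers, cna_type, threshold=0.3):
    """Check one candidate gene against the bone metastasis tumors.

    copy_numbers -- log-2 copy number of the gene in each bone
                    metastasis tumor
    cna_type     -- "amplified" or "deleted", the status of the gene
                    in the whole cohort
    """
    gains = sum(1 for v in copy_numbers if v > threshold)
    losses = sum(1 for v in copy_numbers if v < -threshold)
    if cna_type == "amplified":
        return gains > losses
    if cna_type == "deleted":
        return losses > gains
    raise ValueError("cna_type must be 'amplified' or 'deleted'")
```

Values exactly at ±0.3 count as neither gain nor loss, matching the strict inequalities in the definition.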

Survival Analysis.

We applied Cox proportional hazard regression on biomarkers of interest to obtain a multivariate linear regression model that best predicts the biochemical recurrence of prostate cancer. Tumors were subsequently divided into high-risk and low-risk groups according to the scores. Kaplan-Meier curves were plotted with R software, and the statistical significance was estimated by log-rank test.
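The grouping and curve estimation can be sketched in Python as below. The median cutoff in `split_by_median` is an assumption, since the text states only that tumors were divided according to the scores; the Kaplan-Meier estimator follows the standard product-limit definition, and the log-rank test is not included.

```python
from statistics import median


def split_by_median(scores):
    """Label each tumor "high" or "low" risk (median cutoff assumed)."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


def kaplan_meier(times, events):
    """Product-limit estimate of recurrence-free survival.

    times  -- follow-up time for each patient
    events -- 1 if recurrence was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        leaving = sum(1 for ti in times if ti == t)
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Everyone with follow-up ending at t leaves the risk set.
        at_risk -= leaving
    return curve
```

One curve would be computed per risk group, using only the patients assigned to that group by `split_by_median`.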

Prognostic Model Construction.

To identify the markers that enhance the existing 4-gene model in predicting prostate cancer recurrence, we adopted a stepwise forward selection approach. Using the 4-gene model as the core model, we tested each gene separately to check whether adding the marker to the model would enhance the fit of the multivariate model while keeping the individual adjusted P values below a threshold.

To construct high-sensitivity and high-specificity recurrence models for low-risk and high-risk tumor detection, respectively, we adopted a stepwise selection algorithm. Briefly, we started by selecting the single best marker and the optimal expression cutoff that maximized sensitivity or specificity, then iterated the selection step, in each step adding the one PD that best enhanced the current best model, until no further addition could increase the prediction performance.
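The greedy loop described above can be sketched generically in Python. The sketch leaves the scoring function to the caller and omits the search over expression cutoffs, so it illustrates only the control flow; `forward_select` and its parameters are hypothetical names, and the `start` argument stands in for a core model such as the 4-gene model.

```python
def forward_select(candidates, score, start=()):
    """Greedy forward selection of markers.

    candidates -- list of marker names to consider
    score      -- callable taking a list of markers and returning a
                  number to maximize (e.g., sensitivity or specificity)
    start      -- markers forced into the model before selection begins
    Returns (selected markers, best score).
    """
    selected = list(start)
    best = score(selected)
    remaining = [c for c in candidates if c not in selected]
    while remaining:
        # Score every one-marker extension of the current model.
        trials = [(score(selected + [c]), c) for c in remaining]
        top_score, top = max(trials, key=lambda pair: pair[0])
        if top_score <= best:
            break  # no single addition improves performance
        selected.append(top)
        remaining.remove(top)
        best = top_score
    return selected, best
```

The loop stops as soon as no remaining marker improves the score, which mirrors the stopping rule stated in the paragraph.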

Example 2: Telomerase Reactivation Enables Emergence of Aggressive Prostate Cancers with Skeletal Metastases

A novel inducible telomerase reverse transcriptase (mTert) allele was generated by embedding a Lox-Stop-Lox (LSL) cassette in the first intron (FIG. 1A). Upon successive generational intercrosses of LSL-mTert homozygous mice, late generations show classical constitutional signs of telomere dysfunction including reduced body weight, widespread organ atrophy, diminished proliferation and increased apoptosis in highly proliferative tissues, among other phenotypes as reported previously (Lee et al., Nature (1998)) (FIG. 6, FIG. 1B-F) (FIG. 1). The LSL-mTert mice were intercrossed with those possessing the prostate-specific Cre deletor transgene, PB-Cre4 (Wu et al., Mech. Dev. (2001)), and conditional knockout alleles of Pten (Zheng et al., Nature (2008)) and p53 (Jonkers et al., Nat. Genet. (2001)), hereafter PB-Pten/p53. All alleles were backcrossed a minimum of 4 times onto the C57Bl/6 strain.

The PB-Pten/p53 alleles were carried through successive generational mating of LSL-mTert homozygous mice (FIG. 6), generating ‘telomere-intact’ controls (wildtype and LSL-mTert heterozygous mice, designated ‘G0 PB-Pten/p53’) and ‘telomere dysfunctional’ experimental mice (third and fourth generation LSL-mTert homozygotes, designated G3/G4 LSL-mTert PB-Pten/p53). In parallel, we generated control and experimental cohorts of PB-Pten/p53 mice harboring the conventional mTert null allele (mTert-) (Farazi et al., Cancer Res. (2006)), producing analogous G0 and G3/4 groups for study of telomere dysfunction only.

Consistent with previous reports (Chen et al., Nature (2005)), all G0 PB-Pten/p53 mice developed rapidly progressive, locally invasive prostate adenocarcinomas, resulting in lethal urinary obstruction and renal failure by 34 weeks of age (FIG. 2A-D), whereas G3/4 mTert^(−/−) PB-Pten/p53 mice had significantly smaller, poorly progressive tumors over the same period (FIG. 2A-B). Notably, G3/4 LSL-mTert PB-Pten/p53 mice developed bulky lethal tumors by 24 weeks of age (FIG. 2A-B). Correspondingly, serial histological analyses revealed the presence of high-grade prostate intraepithelial neoplasia (HPIN) by age 9 weeks in all three cohorts. However, G4 mTert^(−/−) PB-Pten/p53 tumors failed to progress beyond HPIN through 24 weeks of age (Table 1, FIG. 2C-D), a pattern consistent with the established role of telomere dysfunction in facilitating cancer initiation yet constraining full malignant progression (Rudolph et al., Nat. Genet. (2001); Chang et al., Genes Dev. (2003); Gonzalez-Suarez et al., Nat. Genet. (2000); Jaskelioff et al., Oncogene (2009)). In sharp contrast, G0 mTert+/+ PB-Pten/p53 and G3/4 LSL-mTert PB-Pten/p53 tumors evolved rapidly to invasive adenocarcinoma by 24 weeks of age (FIG. 2C-D). A distinctive feature of the G3/4 LSL-mTert PB-Pten/p53 mice was the presence of metastatic lesions in the lumbar spine (5/20, 25%) (FIG. 2E-F). Thus, telomerase reactivation in the setting of telomere dysfunction and dual deficiencies of Pten and p53 enables full malignant progression, including acquisition of unprecedented tumor biological properties such as bony tumor growth.

Next, we monitored the impact of telomere dysfunction and telomerase reactivation at the molecular and cell biological levels in prostate tumors of each of the models at the same age. Quantitative Telomere-FISH analysis revealed that telomere reserves were significantly decreased in G4 mTert−/− PB-Pten/p53 samples relative to G0 mTert+/+ PB-Pten/p53 samples (FIG. 8). G4 LSL-mTert PB-Pten/p53 samples showed significantly longer telomeres compared to G4 mTert−/− PB-Pten/p53 samples (FIG. 8). Eroded dysfunctional telomeres generate a DNA damage response (Takai et al., Curr. Biol. (2003); IJpma et al., Mol. Biol. Cell (2003)). To further assess the functional status of telomeres, we audited the level of DNA damage signaling via analysis of p53BP1 foci in prostate tumor cells at the same age. Strong anti-p53BP1 signal was detected in G4 mTert^(−/−) PB-Pten/p53 prostate tumor cells, and this signal was greatly reduced in G0 mTert+/+ PB-Pten/p53 and LSL-mTert PB-Pten/p53 prostate tumor cells (FIG. 3C-D; n=3 each). Correspondingly, TUNEL, activated Caspase-3, and Ki67 assays showed markedly increased apoptosis and decreased proliferation in the G4 mTert−/− PB-Pten/p53 prostate tumor samples compared with G0 mTert+/+ PB-Pten/p53 and G4 LSL-mTert PB-Pten/p53 prostate tumor samples (FIG. 3C-E). These findings are consistent with telomerase-mediated alleviation of telomere checkpoints in the prostate cancers of the G4 LSL-mTert PB-Pten/p53 model.

Taken together, the molecular and phenotypic characterization of these three models demonstrated that telomerase reactivation not only enables the bypass of the progression block conferred by telomere dysfunction by quelling the DNA damage signals, but also engenders the acquisition of new tumor biological properties (bony tumor growth) not observed in tumors which did not experience a period of telomere dysfunction with subsequent telomerase reactivation in their evolution. This provides the first genetic proof in support of the thesis that telomerase reactivation and genome stabilization are necessary to drive full malignant progression in epithelial cancers.

Example 3: Telomere Dysfunction in Murine Prostate Cancers Generates Recurrent Copy Number Aberrations with Relevance to Human Prostate Cancer

The extensive level of telomere dysfunction in the G4 LSL-mTert PB-Pten/p53 mouse, combined with the onset of telomerase activation in the prostate only upon sexual maturity at 5-7 weeks of age (i.e., PB-Cre4 is androgen-responsive) (Chen et al., Nature (2005); Wu et al., Mech. Dev. (2001)), presumably allowed for the accumulation of baseline instability prior to telomerase reactivation. To assess this supposition, we conducted spectral karyotype (SKY) and array-comparative genome hybridization (array-CGH) analyses of G0 mTert+/+ PB-Pten/p53 and G4 LSL-mTert PB-Pten/p53 prostate cancers. SKY analysis revealed a higher frequency of chromosomal structural aberrations in the G4 LSL-mTert PB-Pten/p53 tumor samples (n=5) relative to G0 mTert+/+ PB-Pten/p53 controls (n=4) (FIG. 4A-B; 3.2 versus 1.0 per 100 chromosomes, respectively, P<0.05, t-test). These aberrations included multicentric chromosomes, non-reciprocal translocations, and p-p, p-q and q-q chromosome arm fusions involving homologous and/or non-homologous chromosomes (FIG. 4C). In addition, 14/31 G3/4 mTert−/− PB-Pten/p53 mice eventually developed small, modestly advanced invasive prostate cancers which exhibited highly anaplastic features such as nuclear pleomorphism (Table 1, FIG. 2D, FIG. 9A). SKY analysis of these G4 mTert−/− PB-Pten/p53 tumors revealed cytogenetic complexity comparable to that of the G4 LSL-mTert PB-Pten/p53 tumors (FIG. 9B-D).

Telomere dysfunction and the ensuing bridge-fusion-breakage process generate DNA double-strand breaks that enable regional amplifications and deletions, often at the sites of breakage (O'Hagan et al., 2002). Under biological selection, this process results in enrichment for aberrations at cancer-relevant loci (Maser et al., Nature (2007); Artandi et al., Nature (2000)), prompting us to conduct array-CGH and transcriptional profile analyses of 18 G3/G4 LSL-mTert PB-Pten/p53 tumors (Table 1). Array-CGH revealed 94 copy number alterations (CNAs) encompassing 2183 amplified and 3531 deleted genes (FIG. 10, Table 2 showing gene lists). We next asked whether these CNAs were syntenic to those observed in 194 human prostate cancer profiles possessing 55 recurrent focal CNAs and 10 recurrent large chromosomal arm gains or losses as defined by the GISTIC2 algorithm (Beroukhim et al., Proc. Natl. Acad. Sci. U.S.A. (2007)) (FIG. 10). Twenty-two of the 94 murine CNAs corresponded to regions subject to copy number changes in the human prostate cancer profiles (P=0.189, permutation test). These cross-species comparisons resulted in a significant reduction in the total number of genes resident in these conserved CNAs: 300 amplified genes and 441 deleted genes (FIG. 10, Table 2). One cross-species conserved CNA involving mouse chromosome 15 and human chromosome 8 was notable for high recurrence in both species (mouse: 12/18, 67%; human: 43/194, 22%)² (Table 2, FIG. 4D). This region contains the prostate cancer-relevant Myc oncogene as well as other known cancer genes such as FZD6 (Table 2).

To further refine the candidate gene list, each amplified gene was examined for gene copy-driven expression in mouse and human samples, while each deleted gene was looked up in the COSMIC database (Forbes et al., Nucleic Acids Res. (2010); Forbes et al., Nucleic Acids Res. (2011)) for non-synonymous mutation, in the PubMeth database (Ongenaert et al., Nucleic Acids Res. (2008)) for promoter hypermethylation, and in NCBI PubMed for cancer mutations in any cancer type. An additional clinical relevance filter checked whether an amplified or deleted gene is over- or under-expressed in metastatic versus primary tumors in 6 prostate cancer cohorts from the Oncomine compendium (Lapointe et al., Proc. Natl. Acad. Sci. U.S.A. (2004); LaTulippe et al., Cancer Res. (2002); Vanaja et al., Cancer Res. (2003); Varambally et al., Cancer Cell (2005); Yu et al., J. Clin. Oncol. (2004); Holzbeierlein et al., Am. J. Pathol. (2004)). This exercise further culled the list to 77 amplified and 151 deleted genes (FIG. 10, Table 3). Pathway enrichment analysis revealed that these 77/151 genes showed significant enrichment for cancer-relevant pathways such as cell communication (P=0.002, FDR=0.03), TGF beta signaling (P=0.006, FDR=0.05), fatty acid metabolism (P=0.021, FDR=0.07), and WNT signaling (P=0.042, FDR=0.11) (Table 4). In contrast, the remaining CNA genes (223 amplified and 290 deleted) were only weakly enriched for WNT signaling (P=0.046, FDR=0.37).

With regard to TGF beta signaling genes, the deletion of the Smad2/Smad4 region in 2/18 (11%) G3/G4 LSL-mTert PB-Pten/p53 tumors was particularly noteworthy in light of recent work validating the role of Smad4 as a bona fide tumor suppressor in prostate cancer in the mouse, specifically that dual deficiencies of Pten and Smad4 drive prostate cancer progression (Ding et al., Nature (2011)). In human prostate cancer, there is frequent epigenetic silencing of the SMAD4 promoter in advanced disease (Aitchison et al., Prostate (2008)), and the SMAD4 region is subject to deletion in approximately 35/194 (18%) human prostate tumors, although the region of deletion is large (Taylor et al., Cancer Cell (2010)).

The occurrence of spontaneous Smad4 deletion in the background of Pten and p53 deficiency (FIG. 5A) raised the possibility that these three genetic events may cooperate to drive prostate cancer progression. To determine the potential cooperative actions of these genetic events, we examined their co-occurrence in human prostate cancers of the Taylor dataset. The loss of SMAD4 was a significant event together with TP53 and PTEN loss in human prostate cancer (FIG. 5B-D, FIG. 17-31) and was enriched in the prostate metastasis samples (FIG. 5E, p=2.9e-6 by Fisher's exact test). These statistical findings prompted us to secure in vivo genetic evidence of cooperativity via prostate-specific deletion of Smad4, Pten and/or p53 on a telomere-intact background. These genetic studies demonstrated that prostate-specific deletion of all three tumor suppressors generates a more aggressive prostate cancer phenotype relative to prostates sustaining single or double deficiencies for Pten and p53 or Smad4. The life span of mice with triple deletion of Pten/p53/Smad4 is significantly shorter (P<0.0001, log-rank test), with a median survival time of 17.05 weeks, while median survival time is 26.3 weeks for double deletion of Pten and p53 and 22.8 weeks for double deletion of Pten and Smad4. Most notably, 3/24 of these mice displayed bone metastasis (FIG. 5G). Therefore, this in vivo genetic study validated the cooperative roles of Pten, p53 and Smad4 deficiencies in the progression of prostate cancer.

Kaplan-Meier analysis for biochemical recurrence (BCR, defined by post-op PSA>0.2 ng/ml) of the two cancer patient clusters in the Taylor et al. dataset (2010) was conducted using the R survival package. The combined new 17 gene set can significantly enhance the sensitivity and specificity of SMAD4/PTEN/CCND1/SPP1 in dichotomizing prostate cancer cases into low versus high risk groups for BCR in the Taylor et al. dataset (FIG. 15C).

Kaplan-Meier analysis for biochemical recurrence (BCR, defined by post-op PSA>0.2 ng/ml) of the two cancer patient clusters in the Glinsky et al. dataset (Glinsky et al., J. Clin. Invest (2004)) was conducted using the R survival package. The combined new 17 gene set (DNAJC15, KIF5B, LECT1, DSG2, ACAA2, ASAP1, LMO7, SVIL, DSC2, PCDH9, SMAD7, WDR7, LAMA3, PCDH8, MKX, MSR1, POLR2K) can significantly enhance the sensitivity and specificity with which SMAD4/PTEN/CCND1/SPP1 dichotomizes prostate cancer cases into low versus high risk groups for BCR in the Glinsky et al. dataset (FIG. 16C).
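The Kaplan-Meier analyses above were run with the R survival package; the Python sketch below is only an illustration of the product-limit estimator itself, under stated assumptions. The follow-up times, event flags and group labels are hypothetical, and `kaplan_meier` is an illustrative helper name, not a function from any package used in this work.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of recurrence-free survival.

    times:  follow-up time per patient (e.g. months to BCR or last contact)
    events: 1 if BCR was observed at that time, 0 if the patient was censored
    Returns a list of (time, S(time)) at each time where an event occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve

# Hypothetical follow-up data for two signature-defined clusters.
low_risk = kaplan_meier([12, 30, 45, 60, 60], [0, 1, 0, 0, 0])
high_risk = kaplan_meier([5, 9, 14, 20, 36], [1, 1, 1, 0, 1])
print("low risk:", low_risk)
print("high risk:", high_risk)
```

Comparing the two curves (for example with a log-rank test, not shown) is what separates low-risk from high-risk groups in an analysis of this kind.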

Example 4: TGFβ/Smad4 Pathway in Prostate Tumors with Bone Metastasis

As a first step to identify molecular events capable of driving metastasis to the bone, we asked whether a subset of the 228 candidate genes of Table 3 is subjected to consistent amplification/deletion in the 14 bone metastases in the cohort reported by Taylor et al. (Taylor et al., Cancer Cell (2010)). Specifically, we interrogated each of the 77 amplified or 151 deleted candidates for evidence that it is more likely to be amplified or deleted in bone metastasis, respectively. The resultant 113-gene list (comprising 37 amplified and 76 deleted genes associated with bone metastasis) was then entered into knowledge-based pathway analysis. Interestingly, TGFβ signaling genes represented the most significantly enriched network among the 9 significant pathways with FDR<0.1 (Table 6; FIG. 33). Corroborating this pathway analysis result is the observation that Smad4 is encompassed by genomic loss in 2 of the 18 (11%) G3/G4 LSL-mTert PB-Pten/p53 tumor genomes, suggesting that TGFβ signaling and SMAD4 specifically may be targeted during prostate cancer skeletal metastasis. This is consistent with recent reports on the pathogenetic and prognostic roles of SMAD4 in human prostate cancer (Ding et al., Nature (2011)) and its frequent epigenetic silencing in advanced disease (Aitchison et al., Prostate (2008)).
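The text does not state which statistic the knowledge-based pathway tool applied. As a sketch under that caveat, the Python example below assumes a hypergeometric over-representation test followed by Benjamini-Hochberg adjustment, one common way to arrive at an FDR<0.1 cutoff. The pathway names, pathway memberships and background size are hypothetical, and the gene list is a toy subset used only to exercise the code.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(overlap >= k) when n genes are drawn from a background of N genes,
    K of which belong to the pathway."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / comb(N, n)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical inputs for illustration only.
BACKGROUND_SIZE = 20000
gene_list = {"SMAD2", "SMAD4", "SMAD7", "DCC", "PTK2", "CUL2"}
pathways = {
    "toy_pathway_A": {"SMAD2", "SMAD4", "SMAD7", "GENE_X"},
    "toy_pathway_B": {"PTK2", "GENE_Y", "GENE_Z"},
    "toy_pathway_C": {"GENE_P", "GENE_Q"},
}

names = list(pathways)
pvals = [
    hypergeom_sf(len(gene_list & pathways[name]), BACKGROUND_SIZE,
                 len(pathways[name]), len(gene_list))
    for name in names
]
fdr = benjamini_hochberg(pvals)
significant = [name for name, q in zip(names, fdr) if q < 0.1]
print(significant)
```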

In summary, pathway analysis of the cross-species conserved gene list triangulated with the biological phenotype in human prostate cancers led to the hypothesis that TGFβ/SMAD4 signaling is an important driver of bone metastasis in the context of Pten and p53 deficiencies. Utilizing the combined Pten/p53/Smad4 GEM model, we demonstrate that the new tumor biological property (skeletal metastasis) of this GEM model is not present in Pten/p53 or Pten/Smad4 telomere-intact GEM models. This invention establishes, in a genetic manner, that telomerase reactivation in tumor cells experiencing telomere dysfunction provides a mechanism for selection of cooperative events required to progress fully and manifest the tumor biological properties governed by such genomic events.

Example 5: Evolutionarily Conserved Genomically Altered Genes Correlating with Bone Metastasis are Prognostic in Human

The in vivo genetic experiment above proving a driver role for Smad4 in bone metastasis suggests that additional genes on our bone metastasis-associated gene list may have functional importance as well. Since SMAD4 has also been shown to carry prognostic significance (Ding et al., Nature (2011)), we reasoned that prognostic relevance may serve as a surrogate for biological importance. As a proof of concept, we focused on the 14 genes (ATP5A1/ATP6V1C1/CUL2/CYC1/DCC/ERCC3/MBD2/MTERF/PARD3/PTK2/RBL2/SMAD2/SMAD4/SMAD7) that are represented in the 9 pathways found to be significantly enriched in the bone metastasis-associated gene list (Table 6). Specifically, we assessed how robustly these 14 genes can stratify risk for biochemical recurrence (BCR, PSA>0.2 ng/ml) among the 140 patients with outcome annotation (Taylor et al., Cancer Cell (2010)). The overall risk score based on the 14-gene signature was significantly prognostic of BCR, with a hazard ratio of 13 (P-value <10⁻¹⁴, overall C-index=0.93, see FIG. 34) by multivariate Cox regression. Further support for these 14 genes as likely drivers of the bone metastasis phenotype derived from the observation that they provided independent prognostic value to the previously reported 4-gene signature (comprising PTEN/SMAD4/CCND1/SPP1) derived from the Pten/Smad4 model (Ding et al., Nature (2011)), consistent with the fact that bone metastasis was not observed in the Pten/Smad4 GEM model (hazard ratio=8.7, P=2.16×10⁻¹³ and overall C-index=0.93, see FIG. 34). In particular, combining the 14-gene signature with the 4-gene signature increases the predictive power beyond that of either gene set alone (hazard ratio=20, P<10⁻¹⁴, and overall C-index=0.96, see FIG. 34).
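To make the risk-score and C-index terminology above concrete, the following Python sketch computes a linear risk score from per-gene coefficients and evaluates it with Harrell's concordance index. It is an illustration under stated assumptions: the coefficients, expression values and follow-up data are hypothetical, the Cox model fitting step itself is not shown, and `risk_score` and `concordance_index` are illustrative helper names.

```python
def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over signature genes."""
    return sum(coefficients[g] * expression[g] for g in coefficients)

def concordance_index(times, events, scores):
    """Harrell's C-index: fraction of comparable patient pairs in which the
    patient who recurred earlier also has the higher risk score.
    Ties in score count as half-concordant."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical coefficients (in practice, estimated by Cox regression).
coefs = {"SMAD4": -0.8, "PTK2": 0.5, "CYC1": 0.3}
# Hypothetical normalized expression profiles for four patients.
patients = [
    {"SMAD4": -1.2, "PTK2": 0.9, "CYC1": 0.4},
    {"SMAD4": -0.3, "PTK2": 0.2, "CYC1": 0.1},
    {"SMAD4": 0.5, "PTK2": -0.1, "CYC1": 0.0},
    {"SMAD4": 1.1, "PTK2": -0.6, "CYC1": -0.2},
]
months_to_bcr = [6, 18, 40, 60]
bcr_observed = [1, 1, 1, 0]

scores = [risk_score(p, coefs) for p in patients]
print(concordance_index(months_to_bcr, bcr_observed, scores))
```

A C-index of 0.5 corresponds to a score with no discriminative value and 1.0 to perfect ranking, which is the scale on which the values of 0.93 and 0.96 reported above should be read.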

Taken together, the prognostic correlation of these 14 genes represented in the 9 functional pathways enriched in the bone metastasis-associated gene set provides correlative evidence for the biological relevance of these genes to human prostate cancers. Additionally, these results serve as validation of the integrative approach adopted by this invention, which leverages the clear genotype-phenotype correlation in model systems with the power of genomic and bioinformatic analyses to elucidate molecular mechanisms driving bone metastasis in human prostate cancers.

The above-described genetic studies in vivo, together with human and mouse prostate cancer genomic data, provide evidence that telomere dysfunction plays a critical role in prostate cancer initiation and progression, permitting acquisition of and selection for cancer-relevant genomic events upon telomerase reactivation. In addition, our studies establish the first formal proof that telomere dysfunction and subsequent telomerase activation enable evolving cancers to progress fully and acquire new tumor biological properties, including cardinal features of advanced human prostate cancer. Finally, comparative oncogenomic analysis of gene copy number and expression profiles with genotype-phenotype correlation resulted in identification of genes associated with progression to bone metastasis, highlighting the utility of this integrative approach for cancer gene discovery in prostate cancer.

Our inducible telomerase model system enabled genetic analysis of the impact of physiological endogenous telomerase reactivation in a naturally arising solid tumor with short dysfunctional telomeres. These studies established that telomerase reactivation enabled rapidly progressive disease in all cases. At the same time, we established that antecedent telomere dysfunction enabled the acquisition of genomic events including those capable of endowing tumors with new biological properties such as bone metastases, a phenotype not observed in G0 PB-Pten/p53 tumors (telomere intact). Thus, we conclude that a period of telomere dysfunction is a mechanism for the development of chromosomal aberrations targeting genes involved in prostate cancer development including bone metastasis.

We suggest that reactivation of telomerase in the setting of pre-existing genome instability can be a genomic mechanism for selection of cooperative events required for ultimate progression. In other words, telomerase reactivation is not merely a permissive step that removes DNA damage, but is instead an active driver of progression. The above-described experimental data provide formal genetic proof for this thesis. By triangulating the list of genes resident in syntenic sCNAs in mouse and human prostate cancers with biological phenotype in human (e.g. documented bone metastasis), we have defined a prioritized list of bone metastasis-associated genes. Pathway analysis with this list revealed dominance of the TGFβ/SMAD4 network, which, coupled with the observation of spontaneously acquired Smad4 genomic loss in two of the mouse tumors, led to the hypothesis that TGFβ signaling and SMAD4 inactivation is a driver for bone metastasis in prostate cancers. Again, leveraging the power of genetic engineering in the mouse, we went on to perform the definitive genetic validation experiment proving the cooperativity of p53/Pten/Smad4 co-deletion in driving prostate tumorigenesis and progression to bone metastasis in vivo.

This invention provides in vivo genetic evidence that telomerase reactivation quells DNA damage signaling and stabilizes the genome of an initiated cancer to permit cancer progression. This invention also provides the first genetic proof in naturally occurring and initiated cancer in vivo that telomere dysfunction followed by telomerase re-activation serves as a mechanism for the generation of and selection for cancer-relevant genomic alterations to drive progression and new tumor biological hallmarks such as metastasis to bone. Thus, telomerase serves as an active driver of cancer progression in the setting of telomere-based crisis. Furthermore, the validation of telomere dysfunction as a relevant genome instability mechanism in prostate cancer, the generation of highly rearranged genomes with syntenic events, and the in silico documentation that altered genes are enriched for cancer relevance collectively provide a system to enhance the mining of complex human prostate cancer genomes to identify genetic events governing prostate cancer progression.

TABLE 1 Murine prostate cancer model used in this invention.

Group  Mouse #  mTert  H&E  PCA in bone  Sky
A  4005  mTert^(+/+)  giant invasive
A  4145  mTert^(+/+)  giant invasive
A  4361  mTert^(+/+)  giant invasive
A  4485  mTert^(+/+)  giant invasive
A  4610  mTert^(+/+)  giant invasive
A  5187  mTert^(+/+)  giant invasive
A  5466  mTert^(+/+)  giant invasive
A  5468  mTert^(+/+)  giant invasive
A  5810  mTert^(+/+)  giant invasive
A  6040  mTert^(+/+)  giant invasive  yes
A  6337  mTert^(+/+)  giant invasive
A  6679  mTert^(+/+)  giant invasive
A  6681  mTert^(+/+)  giant invasive  yes
A  6729  mTert^(+/+)  giant invasive  yes
A  7250  mTert^(+/+)  giant invasive
A  7257  mTert^(+/+)  giant invasive
A  7534  mTert^(+/+)  giant invasive
A  8432  mTert^(+/+)  giant invasive
A  4998  mTert^(+/+)  giant invasive  yes
A  11232  mTert^(+/+)  giant invasive
B  2669  mTert^(−/−)  HPIN
B  12030  mTert^(−/−)  HPIN
B  11713  mTert^(−/−)  HPIN
B  11635  mTert^(−/−)  HPIN
B  11566  mTert^(−/−)  HPIN
B  11024  mTert^(−/−)  HPIN
B  10934  mTert^(−/−)  HPIN
B  10538  mTert^(−/−)  HPIN
B  10026  mTert^(−/−)  HPIN
B  8738  mTert^(−/−)  HPIN
B  7111  mTert^(−/−)  HPIN  yes
B  7110  mTert^(−/−)  HPIN  yes
B  6834  mTert^(−/−)  HPIN
B  4128  mTert^(−/−)  HPIN
B  2670  mTert^(−/−)  HPIN
B  12319  mTert^(−/−)  HPIN
B  12073  mTert^(−/−)  HPIN
B  11769  mTert^(−/−)  small but invasive
B  11714  mTert^(−/−)  one lobe HPIN, one lobe invasive tumor
B  8671  mTert^(−/−)  HPIN and invasive
B  7825  mTert^(−/−)  small but invasive tumor
B  7742  mTert^(−/−)  one lobe HPIN, one invasive
B  7132  mTert^(−/−)  invasive
B  5817  mTert^(−/−)  giant invasive  yes
B  5382  mTert^(−/−)  invasive  yes
B  4569  mTert^(−/−)  HPIN and invasive
B  4375  mTert^(−/−)  giant invasive
B  4370  mTert^(−/−)  giant invasive
B  4232  mTert^(−/−)  one lobe HPIN, one invasive
B  4122  mTert^(−/−)  big and invasive
B  4072  mTert^(−/−)  big and invasive
C  4174  mTertpc^(+/−)  giant invasive  yes
C  7106  mTertpc^(+/−)  giant invasive  yes
C  7523  mTertpc^(+/−)  giant invasive
C  7525  mTert^(pc−/−)  giant invasive
C  7526  mTert^(pc−/−)  giant invasive
C  8584  mTert^(pc−/−)  giant invasive
C  8589  mTertpc^(+/−)  giant invasive  yes
C  8591  mTertpc^(+/−)  giant invasive  yes
C  8781  mTert^(pc−/−)  giant invasive  yes  yes
C  9492  mTert^(pc−/−)  giant invasive  yes
C  9493  mTert^(pc−/−)  giant invasive
C  10025  mTertpc^(+/−)  giant invasive  yes
C  10118  mTert^(pc−/−)  giant invasive
C  11563  mTertpc^(+/−)  giant invasive
C  11649  mTertpc^(+/−)  giant invasive
C  11756  mTert^(pc−/−)  giant invasive  yes
C  11819  mTertpc^(+/−)  giant invasive
C  11923  mTertpc^(+/−)  giant invasive
C  11959  mTertpc^(−/−)  giant invasive
C  11960  mTert^(pc−/−)  giant invasive  yes

TABLE 2 Copy number driven gene expression gene list.

Determinant No.  Gene Symbol  Amplification (300 genes) / Deletion (441 genes)
1  ABCB1  amplification
2  ABCB4  amplification
3  ABRA  amplification
4  ACN9  amplification
5  ADAM22  amplification
6  ADCK5  amplification
7  ADCY8  amplification
8  AGPAT6  amplification
9  AKAP9  amplification
10  ANGPT1  amplification
11  ANK1  amplification
12  ANKIB1  amplification
13  ANKRD46  amplification
14  ANXA13  amplification
15  ARC  amplification
16  ARF5  amplification
17  ARHGAP39  amplification
18  ARMC1  amplification
19  ASAP1  amplification
20  ATAD2  amplification
21  ATP6V0D2  amplification
22  ATP6V1C1  amplification
23  AZIN1  amplification
24  BAALC  amplification
25  BAI1  amplification
26  BOP1  amplification
27  C7ORF23  amplification
28  C7ORF62  amplification
29  C7ORF63  amplification
30  C7ORF64  amplification
31  C8ORF30A  amplification
32  C8ORF47  amplification
33  C8ORF55  amplification
34  C8ORF76  amplification
35  C8ORF82  amplification
36  C8ORF85  amplification
37  CAPZA2  amplification
38  CAV1  amplification
39  CDK14  amplification
40  CDK6  amplification
41  CHCHD7  amplification
42  CHRAC1  amplification
43  CLDN12  amplification
44  CNGB3  amplification
45  COL14A1  amplification
46  COL22A1  amplification
47  COLEC10  amplification
48  COMMD5  amplification
49  COX6C  amplification
50  CPNE3  amplification
51  CPSF1  amplification
52  CRH  amplification
53  CROT  amplification
54  CSMD3  amplification
55  CTHRC1  amplification
56  CYC1  amplification
57  CYHR1  amplification
58  CYP11B1  amplification
59  CYP11B2  amplification
60  CYP51A1  amplification
61  DBF4  amplification
62  DCAF13  amplification
63  DENND3  amplification
64  DEPDC6  amplification
65  DERL1  amplification
66  DGAT1  amplification
67  DLX5  amplification
68  DLX6  amplification
69  DLX6-AS  amplification
70  DMTF1  amplification
71  DNAJC5B  amplification
72  DPYS  amplification
73  DSCC1  amplification
74  EBAG9  amplification
75  EEF1D  amplification
76  EFR3A  amplification
77  EIF2C2  amplification
78  EIF3E  amplification
79  EIF3H  amplification
80  ENPP2  amplification
81  ENY2  amplification
82  EPPK1  amplification
83  EXOSC4  amplification
84  EXT1  amplification
85  FAM133B  amplification
86  FAM135B  amplification
87  FAM49B  amplification
88  FAM82B  amplification
89  FAM83A  amplification
90  FAM83H  amplification
91  FAM84B  amplification
92  FAM91A1  amplification
93  FBXL6  amplification
94  FBXO32  amplification
95  FBXO43  amplification
96  FER1L6  amplification
97  FLJ43860  amplification
98  FOXH1  amplification
99  FSCN3  amplification
100  FZD1  amplification
101  FZD6  amplification
102  GATAD1  amplification
103  GCC1  amplification
104  GINS4  amplification
105  GLCCI1  amplification
106  GML  amplification
107  GOLGA7  amplification
108  GPAA1  amplification
109  GPIHBP1  amplification
110  GPNMB  amplification
111  GPR20  amplification
112  GPT  amplification
113  GRHL2  amplification
114  GRINA  amplification
115  GRM3  amplification
116  GRM8  amplification
117  GSDMC  amplification
118  GSDMD  amplification
119  GTPBP10  amplification
120  HAS2  amplification
121  HAS2-AS  amplification
122  HEATR7A  amplification
123  HHLA1  amplification
124  HRSP12  amplification
125  HSF1  amplification
126  ICA1  amplification
127  IGF2BP3  amplification
128  JRK  amplification
129  KCNK9  amplification
130  KCNQ3  amplification
131  KCNS2  amplification
132  KCNV1  amplification
133  KHDRBS3  amplification
134  KIAA0196  amplification
135  KIAA1324L  amplification
136  KIFC2  amplification
137  KLF10  amplification
138  KLHL38  amplification
139  KRIT1  amplification
140  LAPTM4B  amplification
141  LRP12  amplification
142  LRRC14  amplification
143  LRRC24  amplification
144  LRRC6  amplification
145  LY6D  amplification
146  LY6E  amplification
147  LY6H  amplification
148  LY6K  amplification
149  LYN  amplification
150  LYNX1  amplification
151  LYPD2  amplification
152  MAF1  amplification
153  MAFA  amplification
154  MAL2  amplification
155  MAPK15  amplification
156  MATN2  amplification
157  MED30  amplification
158  MET  amplification
159  MFSD3  amplification
160  MIR148A  amplification
161  MIR151  amplification
162  MIR30B  amplification
163  MIR30D  amplification
164  MIR486  amplification
165  MIR592  amplification
166  MIR875  amplification
167  MOS  amplification
168  MRPL13  amplification
169  MTBP  amplification
170  MTDH  amplification
171  MTERF  amplification
172  MTFR1  amplification
173  MTSS1  amplification
174  MYC  amplification
175  NAPRT1  amplification
176  NCALD  amplification
177  NDRG1  amplification
178  NDUFB9  amplification
179  NFKBIL2  amplification
180  NIPAL2  amplification
181  NKX6-3  amplification
182  NOV  amplification
183  NPVF  amplification
184  NRBP2  amplification
185  NSMCE2  amplification
186  NUDCD1  amplification
187  NXPH1  amplification
188  OC90  amplification
189  ODF1  amplification
190  OPLAH  amplification
191  OSR2  amplification
192  OXR1  amplification
193  PABPC1  amplification
194  PARP10  amplification
195  PAX4  amplification
196  PDE7A  amplification
197  PEX1  amplification
198  PGCP  amplification
199  PHF20L1  amplification
200  PKHD1L1  amplification
201  PLAG1  amplification
202  PLEC  amplification
203  POLR2K  amplification
204  POP1  amplification
205  PPP1R16A  amplification
206  PSCA  amplification
207  PTK2  amplification
208  PTP4A3  amplification
209  PUF60  amplification
210  PVT1  amplification
211  PYCRL  amplification
212  RAD21  amplification
213  RECQL4  amplification
214  RGS22  amplification
215  RHPN1  amplification
216  RIMS2  amplification
217  RNF139  amplification
218  RNF19A  amplification
219  RPL30  amplification
220  RPL8  amplification
221  RPS20  amplification
222  RRM2B  amplification
223  RSPO2  amplification
224  RUNDC3B  amplification
225  SAMD12  amplification
226  SCRIB  amplification
227  SCRT1  amplification
228  SCXA  amplification
229  SDC2  amplification
230  SDR16C5  amplification
231  SDR16C6  amplification
232  SFRP1  amplification
233  SHARPIN  amplification
234  SLA  amplification
235  SLC25A32  amplification
236  SLC25A40  amplification
237  SLC30A8  amplification
238  SLC39A4  amplification
239  SLC45A4  amplification
240  SLC7A13  amplification
241  SLURP1  amplification
242  SND1  amplification
243  SNTB1  amplification
244  SNX31  amplification
245  SPAG1  amplification
246  SPATC1  amplification
247  SQLE  amplification
248  SRI  amplification
249  ST3GAL1  amplification
250  ST7  amplification
251  STEAP1  amplification
252  STEAP2  amplification
253  STEAP4  amplification
254  STK3  amplification
255  SYBU  amplification
256  TAC1  amplification
257  TAF2  amplification
258  TATDN1  amplification
259  TG  amplification
260  TGS1  amplification
261  TIGD5  amplification
262  TM7SF4  amplification
263  TMEM65  amplification
264  TMEM68  amplification
265  TMEM71  amplification
266  TMEM74  amplification
267  TNFRSF11B  amplification
268  TOP1MT  amplification
269  TRAPPC9  amplification
270  TRHR  amplification
271  TRIB1  amplification
272  TRIM55  amplification
273  TRMT12  amplification
274  TRPS1  amplification
275  TSPYL5  amplification
276  TSTA3  amplification
277  TTC35  amplification
278  UBA52  amplification
279  UBR5  amplification
280  UTP23  amplification
281  VPS13B  amplification
282  VPS28  amplification
283  WDR67  amplification
284  WDYHV1  amplification
285  WISP1  amplification
286  WWP1  amplification
287  YWHAZ  amplification
288  ZC3H3  amplification
289  ZFAT  amplification
290  ZFP41  amplification
291  ZFPM2  amplification
292  ZHX1  amplification
293  ZHX2  amplification
294  ZNF250  amplification
295  ZNF251  amplification
296  ZNF623  amplification
297  ZNF7  amplification
298  ZNF706  amplification
299  ZNF707  amplification
300  ZNF800  amplification
301  ABCC12  deletion
302  ABHD3  deletion
303  ACAA2  deletion
304  ACSL1  deletion
305  ADAM29  deletion
306  ADCY7  deletion
307  AGA  deletion
308  AKAP11  deletion
309  AKTIP  deletion
310  ALDH7A1  deletion
311  ALPK2  deletion
312  AMMECR1L  deletion
313  ANKRD29  deletion
314  ANKRD37  deletion
315  AP3S1  deletion
316  APC  deletion
317  AQP4  deletion
318  ARHGAP12  deletion
319  ARMC4  deletion
320  ASAH1  deletion
321  ASB5  deletion
322  ASXL3  deletion
323  ATG12  deletion
324  ATP5A1  deletion
325  ATP6V1B2  deletion
326  ATP8B1  deletion
327  B4GALT6  deletion
328  BAMBI  deletion
329  BIN1  deletion
330  BNIP3L  deletion
331  BRD7  deletion
332  C13ORF15  deletion
333  C13ORF18  deletion
334  C13ORF30  deletion
335  C13ORF31  deletion
336  C13ORF34  deletion
337  C16ORF78  deletion
338  C16ORF87  deletion
339  C18ORF10  deletion
340  C18ORF21  deletion
341  C18ORF25  deletion
342  C18ORF32  deletion
343  C18ORF34  deletion
344  C18ORF45  deletion
345  C18ORF55  deletion
346  C18ORF8  deletion
347  C1ORF31  deletion
348  C4ORF41  deletion
349  C4ORF47  deletion
350  C5ORF13  deletion
351  CABLES1  deletion
352  CABYR  deletion
353  CAMK4  deletion
354  CASP3  deletion
355  CBLN1  deletion
356  CBLN2  deletion
357  CCBE1  deletion
358  CCDC11  deletion
359  CCDC110  deletion
360  CCDC111  deletion
361  CCDC112  deletion
362  CCDC122  deletion
363  CCDC68  deletion
364  CCNY  deletion
365  CD226  deletion
366  CDH2  deletion
367  CDKN2AIP  deletion
368  CDO1  deletion
369  CELF4  deletion
370  CEP120  deletion
371  CHD9  deletion
372  CHST9  deletion
373  CLDN22  deletion
374  CLN5  deletion
375  CNDP1  deletion
376  CNDP2  deletion
377  CNOT7  deletion
378  COG3  deletion
379  COMMD10  deletion
380  COMMD6  deletion
381  CPB2  deletion
382  CPLX4  deletion
383  CREM  deletion
384  CSGALNACT1  deletion
385  CSNK1G3  deletion
386  CUL2  deletion
387  CXXC1  deletion
388  CYB5A  deletion
389  CYLD  deletion
390  CYP4V2  deletion
391  DACH1  deletion
392  DCC  deletion
393  DCP2  deletion
394  DCTD  deletion
395  DGKH  deletion
396  DIAPH3  deletion
397  DIS3  deletion
398  DMXL1  deletion
399  DNAJA2  deletion
400  DNAJC15  deletion
401  DOK6  deletion
402  DSC1  deletion
403  DSC2  deletion
404  DSC3  deletion
405  DSG1  deletion
406  DSG2  deletion
407  DSG3  deletion
408  DSG4  deletion
409  DTNA  deletion
410  DTWD2  deletion
411  DYM  deletion
412  EDNRB  deletion
413  EFHA2  deletion
414  EIF3J  deletion
415  ELAC1  deletion
416  ELF1  deletion
417  ELP2  deletion
418  ENO1  deletion
419  ENOX1  deletion
420  ENPP6  deletion
421  EPB41L4A  deletion
422  EPC1  deletion
423  EPSTI1  deletion
424  ERCC3  deletion
425  ESCO1  deletion
426  ESD  deletion
427  F11  deletion
428  FAM149A  deletion
429  FAM170A  deletion
430  FAM59A  deletion
431  FAM69C  deletion
432  FAT1  deletion
433  FBXL3  deletion
434  FBXO15  deletion
435  FBXO8  deletion
436  FECH  deletion
437  FEM1C  deletion
438  FGF20  deletion
439  FGL1  deletion
440  FHOD3  deletion
441  FRG2B  deletion
442  FTMT  deletion
443  FTO  deletion
444  FZD8  deletion
445  GALNT1  deletion
446  GALNT7  deletion
447  GALNTL6  deletion
448  GALR1  deletion
449  GATA6  deletion
450  GJD4  deletion
451  GLRA3  deletion
452  GPM6A  deletion
453  GPR17  deletion
454  GPT2  deletion
455  GRAMD3  deletion
456  GREB1L  deletion
457  GRP  deletion
458  GTF2F2  deletion
459  GYPC  deletion
460  HAND2  deletion
461  HAUS1  deletion
462  HDHD2  deletion
463  HEATR3  deletion
464  HELT  deletion
465  HMGB2  deletion
466  HMGXB4  deletion
467  HMOX1  deletion
468  HPGD  deletion
469  HRH4  deletion
470  HSD17B4  deletion
471  HTR2A  deletion
472  IER3IP1  deletion
473  IMPACT  deletion
474  ING2  deletion
475  INO80C  deletion
476  INTS10  deletion
477  IRF2  deletion
478  IRF2BP2  deletion
479  IRG1  deletion
480  ISX  deletion
481  ITFG1  deletion
482  IWS1  deletion
483  KATNAL2  deletion
484  KBTBD7  deletion
485  KCNN2  deletion
486  KCTD1  deletion
487  KCTD12  deletion
488  KCTD4  deletion
489  KIAA0427  deletion
490  KIAA0564  deletion
491  KIAA1430  deletion
492  KIAA1462  deletion
493  KIAA1632  deletion
494  KIAA1704  deletion
495  KIAA1712  deletion
496  KIF5B  deletion
497  KLF12  deletion
498  KLF5  deletion
499  KLHL1  deletion
500  KLHL14  deletion
501  KLKB1  deletion
502  LAMA3  deletion
503  LARGE  deletion
504  LCP1  deletion
505  LECT1  deletion
506  LIMS2  deletion
507  LIPG  deletion
508  LMAN1  deletion
509  LMO7  deletion
510  LONP2  deletion
511  LOX  deletion
512  LOXHD1  deletion
513  LPL  deletion
514  LRCH1  deletion
515  LRP1B  deletion
516  LRP2BP  deletion
517  LRRC63  deletion
518  LYZL1  deletion
519  LZTS1  deletion
520  MALT1  deletion
521  MAP3K2  deletion
522  MAP3K8  deletion
523  MAP7  deletion
524  MAPK4  deletion
525  MAPRE2  deletion
526  MBD1  deletion
527  MBD2  deletion
528  MBP  deletion
529  MC4R  deletion
530  MCC  deletion
531  MCM5  deletion
532  ME2  deletion
533  MED4  deletion
534  MEP1B  deletion
535  MEX3C  deletion
536  MIB1  deletion
537  MICB  deletion
538  MIR1-2  deletion
539  MIR187  deletion
540  MIR383  deletion
541  MIR759  deletion
542  MKX  deletion
543  MLF1IP  deletion
544  MOCOS  deletion
545  MPP7  deletion
546  MRO  deletion
547  MSR1  deletion
548  MTMR7  deletion
549  MTNR1A  deletion
550  MTPAP  deletion
551  MTRF1  deletion
552  MTUS1  deletion
553  MYCBP2  deletion
554  MYLK3  deletion
555  MYO5B  deletion
556  MYO7B  deletion
557  MZT1  deletion
558  N4BP1  deletion
559  NAA16  deletion
560  NARS  deletion
561  NAT1  deletion
562  NAT2  deletion
563  NDFIP2  deletion
564  NEDD4L  deletion
565  NEIL3  deletion
566  NETO1  deletion
567  NETO2  deletion
568  NKD1  deletion
569  NOD2  deletion
570  NOL4  deletion
571  NPC1  deletion
572  NUDT15  deletion
573  NUFIP1  deletion
574  NUTF2  deletion
575  ODZ3  deletion
576  OLFM4  deletion
577  ONECUT2  deletion
578  OR1D2  deletion
579  ORC6  deletion
580  OSBPL1A  deletion
581  PAPD5  deletion
582  PARD3  deletion
583  PCDH17  deletion
584  PCDH20  deletion
585  PCDH8  deletion
586  PCDH9  deletion
587  PCM1  deletion
588  PDGFRL  deletion
589  PDLIM3  deletion
590  PGGT1B  deletion
591  PHAX  deletion
592  PHKB  deletion
593  PIAS2  deletion
594  PIBF1  deletion
595  PIK3C3  deletion
596  PMAIP1  deletion
597  POLI  deletion
598  POLR2D  deletion
599  POU4F1  deletion
600  PPIC  deletion
601  PRDM6  deletion
602  PROC  deletion
603  PRR16  deletion
604  PSD3  deletion
605  PSMA8  deletion
606  PSTPIP2  deletion
607  RAB18  deletion
608  RAB27B  deletion
609  RAP1GAP2  deletion
610  RAX  deletion
611  RBBP8  deletion
612  RBL2  deletion
613  RBM26  deletion
614  RBM34  deletion
615  REEP5  deletion
616  RIOK3  deletion
617  RIT2  deletion
618  RNF125  deletion
619  RNF138  deletion
620  RNF165  deletion
621  RNF219  deletion
622  ROCK1  deletion
623  RPGRIP1L  deletion
624  RPL17  deletion
625  RPRD1A  deletion
626  RSL24D1  deletion
627  RTTN  deletion
628  SALL1  deletion
629  SAP130  deletion
630  SAP30  deletion
631  SCARNA17  deletion
632  SCEL  deletion
633  SCRG1  deletion
634  SEC11C  deletion
635  SEMA6A  deletion
636  SERP2  deletion
637  SETBP1  deletion
638  SFI1  deletion
639  SFT2D3  deletion
640  SGCZ  deletion
641  SH2D4A  deletion
642  SIAH1  deletion
643  SIAH3  deletion
644  SIGLEC15  deletion
645  SKA1  deletion
646  SKOR2  deletion
647  SLAIN1  deletion
648  SLC14A1  deletion
649  SLC14A2  deletion
650  SLC18A1  deletion
651  SLC25A30  deletion
652  SLC25A4  deletion
653  SLC25A46  deletion
654  SLC35F3  deletion
655  SLC39A6  deletion
656  SLC7A2  deletion
657  SLITRK1  deletion
658  SLITRK6  deletion
659  SMAD2  deletion
660  SMAD4  deletion
661  SMAD7  deletion
662  SNCAIP  deletion
663  SNORA31  deletion
664  SNORD58B  deletion
665  SNRPD1  deletion
666  SNX2  deletion
667  SNX20  deletion
668  SNX24  deletion
669  SNX25  deletion
670  SOCS6  deletion
671  SORBS2  deletion
672  SPATA4  deletion
673  SPCS3  deletion
674  SPERT  deletion
675  SPG11  deletion
676  SPRY2  deletion
677  SRFBP1  deletion
678  SRP19  deletion
679  SS18  deletion
680  ST8SIA3  deletion
681  ST8SIA5  deletion
682  STARD4  deletion
683  STARD6  deletion
684  STOX2  deletion
685  SUCLA2  deletion
686  SUGT1  deletion
687  SVIL  deletion
688  SYT4  deletion
689  TAF4B  deletion
690  TARBP1  deletion
691  TBC1D4  deletion
692  TCF4  deletion
693  TDRD3  deletion
694  TICAM2  deletion
695  TLR3  deletion
696  TMED7  deletion
697  TMEM188  deletion
698  TMX3  deletion
699  TNFAIP8  deletion
700  TNFSF11  deletion
701  TOM1  deletion
702  TOMM20  deletion
703  TOX3  deletion
704  TPT1  deletion
705  TRIM36  deletion
706  TSC22D1  deletion
707  TSHZ1  deletion
708  TSLP  deletion
709  TTC39C  deletion
710  TTR  deletion
711  TUSC3  deletion
712  TXNL1  deletion
713  UCHL3  deletion
714  UFSP2  deletion
715  VEGFC  deletion
716  VPS35  deletion
717  VPS37A  deletion
718  WAC  deletion
719  WBP4  deletion
720  WDR17  deletion
721  WDR33  deletion
722  WDR36  deletion
723  WDR7  deletion
724  WWC2  deletion
725  YTHDC2  deletion
726  ZADH2  deletion
727  ZBTB7C  deletion
728  ZC3H13  deletion
729  ZDHHC2  deletion
730  ZEB1  deletion
731  ZNF236  deletion
732  ZNF24  deletion
733  ZNF397  deletion
734  ZNF407  deletion
735  ZNF423  deletion
736  ZNF438  deletion
737  ZNF474  deletion
738  ZNF516  deletion
739  ZNF521  deletion
740  ZNF608  deletion
741  ZSCAN30  deletion

TABLE 3 Integrative approach further combined the cross-species amplifications and deletions with publicly retrieved cancer mutation, methylation, and transcriptome data, generating a list of 77 amplified and 151 deleted PCA prognostic determinants (PDs).

Determinant No.  PDs  GeneName  AMP/DEL  upP.Ho  upP.Lap  upP.LaT  upP.van  upP.Var  upP.Yu  Count
8  PD1  AGPAT6  AMP  NA  NA  NA  0.217946  0.0264  NA  1
9  PD2  AKAP9  AMP  0.05471  5.51E−04  0.23576  0.703938  0.22508  0.45363  1
12  PD3  ANKIB1  AMP  NA  0.003032  NA  0.521475  3.63E−04  NA  2
13  PD4  ANKRD46  AMP  0.88957  0.029742  0.89119  0.090499  0.03486  0.83392  2
18  PD5  ARMC1  AMP  NA  0.021098  NA  0.601844  6.46E−04  NA  2
19  PD6  ASAP1  AMP  NA  0.036458  NA  0.067086  1.63E−04  NA  2
20  PD7  ATAD2  AMP  NA  0.055993  NA  0.056803  1.05E−04  NA  1
22  PD8  ATP6V1C1  AMP  0.12421  0.085423  0.01446  0.054752  3.77E−04  1.30E−07  3
23  PD9  AZIN1  AMP  0.03956  0.915142  0.04987  0.33621  0.02103  0.016  4
34  PD10  C8ORF76  AMP  NA  0.007872  NA  0.442677  6.17E−04  NA  2
41  PD11  CHCHD7  AMP  NA  0.79397  NA  0.836287  0.00229  NA  1
48  PD12  COMMD5  AMP  NA  7.77E−04  NA  0.197694  0.01312  NA  2
49  PD13  COX6C  AMP  0.03296  0.009102  0.14233  0.348064  3.86E−04  0.00991  4
50  PD14  CPNE3  AMP  0.04525  0.361751  0.04914  0.808221  0.41518  9.09E−04  3
56  PD15  CYC1  AMP  0.00814  0.002747  0.01422  0.321037  0.14611  1.49E−05  4
64  PD16  DEPDC6  AMP  NA  0.001757  NA  0.276384  0.98597  NA  1
65  PD17  DERL1  AMP  NA  0.089832  NA  0.12877  7.00E−04  NA  1
70  PD18  DMTF1  AMP  0.52708  NA  0.0662  0.627951  0.1129  1.04E−04  1
72  PD19  DPYS  AMP  NA  0.012606  0.75837  0.210724  0.99865  0.94823  1
73  PD20  DSCC1  AMP  NA  5.83E−04  NA  0.417512  7.42E−05  NA  2
74  PD21  EBAG9  AMP  0.64939  NA  0.2489  0.820623  0.72444  1.87E−06  1
75  PD22  EEF1D  AMP  0.27258  0.004546  0.15316  0.857826  0.07527  0.1703  1
76  PD23  EFR3A  AMP  0.73414  0.007994  0.66914  0.387171  0.91297  0.99997  1
77  PD24  EIF2C2  AMP  NA  0.00389  0.00155  0.044785  7.16E−04  1.14E−07  5
78  PD25  EIF3E  AMP  0.46338  0.119577  0.45561  0.354651  0.99588  0.00228  1
79  PD26  EIF3H  AMP  0.12312  0.48177  0.34815  0.038412  0.03399  0.03522  3
81  PD27  ENY2  AMP  NA  0.117552  NA  0.247419  0.04459  NA  1
84  PD28  EXT1  AMP  0.61787  0.989534  0.37532  0.35169  8.86E−04  3.23E−06  2
87  PD29  FAM49B  AMP  NA  0.327191  NA  0.016178  0.02423  NA  2
88  PD30  FAM82B  AMP  NA  0.001928  NA  0.496775  0.24049  NA  1
102  PD31  GATAD1  AMP  0.2912  0.789965  0.08302  0.591964  0.01181  0.23735  1
104  PD32  GINS4  AMP  NA  NA  NA  0.048529  0.02057  NA  2
114  PD33  GRINA  AMP  0.00986  1.05E−04  0.02378  0.604278  0.03164  0.11769  4
124  PD34  HRSP12  AMP  0.02142  0.007667  0.07278  0.163316  0.6865  0.00552  3
134  PD35  KIAA0196  AMP  0.09645  NA  0.72763  0.539342  0.43622  0.04678  1
139  PD36  KRIT1  AMP  0.15087  0.020978  0.004  0.013417  0.0041  2.70E−06  5
154  PD37  MAL2  AMP  NA  0.062341  NA  0.882402  0.00326  NA  1
169  PD38  MTBP  AMP  NA  NA  NA  0.56228  0.02574  NA  1
170  PD39  MTDH  AMP  NA  0.001712  0.01448  0.274721  0.00839  0.58569  3
171  PD40  MTERF  AMP  NA  9.82E−04  7.97E−06  0.080102  0.17924  0.88131  2
172  PD41  MTFR1  AMP  0.02683  0.022331  0.04408  0.302828  0.01749  1.99E−07  5
185  PD42  NSMCE2  AMP  NA  0.007916  NA  0.098115  0.00553  NA  2
186  PD43  NUDCD1  AMP  NA  0.04879  NA  0.481766  0.00157  NA  2
193  PD44  PABPC1  AMP  0.03294  0.148872  0.07105  0.414925  0.01604  3.11E−06  3
196  PD45  PDE7A  AMP  NA  0.132618  NA  0.263113  0.00769  NA  1
199  PD46  PHF20L1  AMP  NA  0.001288  NA  0.221656  3.82E−04  NA  2
203  PD47  POLR2K  AMP  0.34593  0.083489  0.00814  0.087445  0.01102  0.00538  3
204  PD48  POP1  AMP  NA  2.68E−05  0.53352  0.017861  0.00149  0.93518  3
207  PD49  PTK2  AMP  0.01782  0.586117  0.00311  0.494837  3.32E−04  0.17466  3
209  PD50  PUF60  AMP  0.02648  2.41E−04  0.0223  0.415597  0.03684  0.3536  4
212  PD51  RAD21  AMP  0.000309  0.016963  3.91E−06  0.054738  0.00979  0.00209  5
217  PD52  RNF139  AMP  0.67852  0.026808  0.2851  0.647388  0.97281  0.13614  1
218  PD53  RNF19A  AMP  NA  0.001562  NA  0.033254  0.00144  NA  3
221  PD54  RPS20  AMP  0.50492  0.171712  0.93214  0.89304  0.02302  0.8905  1
245  PD55  SPAG1  AMP  NA  0.032037  NA  0.141283  0.0026  NA  2
247  PD56  SQLE  AMP  0.07232  0.012876  0.02986  0.055675  5.94E−06  4.05E−05  4
248  PD57  SRI  AMP  0.76963  0.045205  0.84885  0.749964  0.99808  0.99972  1
254  PD58  STK3  AMP  0.22197  0.645308  0.08308  0.833353  0.19697  0.00612  1
257  PD59  TAF2  AMP  0.02777  3.40E−05  0.00317  0.153746  0.00342  0.08931  4
260  PD60  TGS1  AMP  NA  0.015925  NA  0.117524  0.02268  NA  2
263  PD61  TMEM65  AMP  NA  0.016029  NA  0.140141  7.65E−04  NA  2
264  PD62  TMEM68  AMP  NA  0.027752  NA  0.503164  0.91584  NA  1
268  PD63  TOP1MT  AMP  NA  NA  NA  0.61768  0.01247  NA  1
269  PD64  TRAPPC9  AMP  NA  NA  NA  0.704428  0.01772  NA  1
277  PD65  TTC35  AMP  0.09176  0.642902  0.24713  0.649507  0.78633  0.01509  1
279  PD66  UBR5  AMP  0.05752  0.002116  0.01938  0.097385  0.11532  0.00156  3
280  PD67  UTP23  AMP  NA  NA  NA  0.222203  0.00257  NA  1
281  PD68  VPS13B  AMP  0.1728  0.750331  0.16772  0.2504  0.00562  1  1
283  PD69  WDR67  AMP  NA  0.003203  0.00791  0.002834  2.68E−04  0.06765  4
284  PD70  WDYHV1  AMP  NA  0.022828  NA  0.585875  0.00276  NA  2
286  PD71  WWP1  AMP  0.06192  0.179087  0.00851  0.982408  0.47898  0.00766  2
287  PD72  YWHAZ  AMP  0.00301  0.021078  0.00106  0.209565  0.00211  0.16303  4
292  PD73  ZHX1  AMP  NA  0.007453  NA  0.347984  0.48004  NA  1
294  PD74  ZNF250  AMP  NA  0.046638  0.35616  0.207615  0.02819  1  2
296  PD75  ZNF623  AMP  0.02117  NA  0.0263  0.528602  0.00404  0.04365  4
297  PD76  ZNF7  AMP  0.09469  0.097333  0.17841  0.304134  0.04002  0.0552  1
298  PD77  ZNF706  AMP  NA  0.002898  NA  0.001659  6.31E−05  NA  3
303  PD78  ACAA2  DEL  0.6979  0.713777  0.21716  0.528437  0.60627  0.00291  1
308  PD79  AKAP11  DEL  0.07568  0.06287  0.51308  0.261338  9.05E−04  8.08E−14  2
310  PD80  ALDH7A1  DEL  0.00103  0.012384  0.18802  0.005912  8.50E−05  0.79381  4
312  PD81  AMMECR1L  DEL  NA  0.971619  NA  0.729332  2.68E−04  NA  1
313  PD82  ANKRD29  DEL  NA  NA  NA  0.764164  0.01788  NA  1
316  PD83  APC  DEL  NA  0.413391  0.10162  0.055214  5.19E−04  0.92753  1
319  PD84  ARMC4  DEL  NA  NA  NA  0.381929  0.00212  NA  1
322  PD85  ASXL3  DEL  NA  NA  NA  0.036787  0.57869  NA  1
324  PD86  ATP5A1  DEL  0.12771  NA  0.23898  0.544066  0.01035  4.19E−10  2
326  PD87  ATP8B1  DEL  0.92236  0.241627  0.84238  0.098611  0.02747  8.00E−08  2
328  PD88  BAMBI  DEL  0.8714  0.87102  0.92974  0.593013  0.91453  0.02099  1
329  PD89  BIN1  DEL  0.30556  0.783858  0.04217  0.445522  2.18E−05  1.23E−07  3
343  PD90  C18ORF34  DEL  NA  NA  NA  NA  0.00849  NA  1
353  PD91  CAMK4  DEL  NA  NA  0.05123  0.558773  3.73E−04  0.01588  2
360  PD92  CCDC111  DEL  NA  0.992872  NA  0.617153  0.01235  NA  1
368  PD93  CDO1  DEL  NA  NA  0.00112  0.33874  1.58E−05  0.32779  2
371  PD94  CHD9  DEL  0.0000783  0.505319  0.16024  0.005594  0.00581  0.0888  3
376  PD95  CNDP2  DEL  NA  0.100467  NA  0.018005  0.00106  NA  2
378  PD96  COG3  DEL  NA  0.048584  NA  0.120685  7.08E−05  NA  2
384  PD97  CSGALNACT1  DEL  NA  0.022367  NA  0.221202  0.03097  NA  2
386  PD98  CUL2  DEL  0.08496  0.154956  0.60765  0.297283  0.5558  0.00108  1
389  PD99  CYLD  DEL  0.12284  5.60E−04  0.08827  0.098299  0.04277  4.24E−05  3
391  PD100  DACH1  DEL  0.04717  2.78E−04  0.75243  1.04E−05  5.61E−04  0.01364  5
392  PD101  DCC  DEL  NA  NA  0.76934  0.010995  0.12412  0.30337  1
398  PD102  DMXL1  DEL  0.03873  0.021287  0.2  0.045559  0.00148  1.73E−04  5
400  PD103  DNAJC15  DEL  NA  0.083352  NA  0.326864  0.03102  NA  1
403  PD104  DSC2  DEL  0.64528  0.173192  0.14328  0.184634  0.0088  0.07863  1
404  PD105  DSC3  DEL  NA  0.300896  0.33796  0.367441  4.03E−04  0.27093  1
405  PD106  DSG1  DEL  NA  NA  0.39228  0.766262  0.88679  5.27E−05  1
406  PD107  DSG2  DEL  0.5299  0.011301  0.99471  0.180832  0.02411  1.29E−09  3
407  PD108  DSG3  DEL  NA  0.930595  0.75291  0.264154  0.01178  0.0134  2
412  PD109  EDNRB  DEL  NA  6.41E−04  0.11328  0.837327  0.49336  0.29616  1
416  PD110  ELF1  DEL  0.01746  3.61E−04  0.67021  0.573557  0.00157  5.13E−07  4
421  PD111  EPB41L4A  DEL  NA  0.002507  NA  2.38E−04  0.00389  NA  3
422  PD112  EPC1  DEL  NA  0.358127  NA  0.221798  6.70E−05  NA  1
424  PD113  ERCC3  DEL  0.90023  0.116141  0.99225  0.754262  0.00243  3.25E−11  2
430  PD114  FAM59A  DEL  NA  0.032444  NA  0.568481  0.86792  NA  1
432  PD115  FAT1  DEL  0.50121  0.126415  0.66327  0.352319  9.01E−04  5.57E−04  2
435  PD116  FBXO8  DEL  NA  0.008653  NA  0.525566  6.52E−04  NA  2
437  PD117  FEM1C  DEL  0.01312  0.001389  0.02931  0.506632  0.00908  0.84918  4
440  PD118  FHOD3  DEL  NA  0.936624  NA  0.017802  1.23E−04  NA  2
445  PD119  GALNT1  DEL  0.73362  0.012908  0.50227  0.815828  0.16483  0.00579  2
446  PD120  GALNT7  DEL  NA  NA  NA  0.199106  0.0129  NA  1
451  PD121  GLRA3  DEL  NA  NA  0.01679  0.079112  0.18684  0.01318  2
459  PD122  GYPC  DEL  0.01527  0.312389  0.03987  0.754247
0.05458 0.02278 3 466 PD123 HMGXB4 DEL 0.55213 0.010162 0.879680.401971 0.03682 0.14951 2 468 PD124 HPGD DEL 0.94438 0.311853 0.636030.263057 1.60E−04 0.99523 1 469 PD125 HRH4 DEL NA NA NA 2.75E−04 0.28199NA 1 470 PD126 HSD17B4 DEL 0.35847 0.012563 0.18397 0.106383 0.263160.98271 1 471 PD127 HTR2A DEL NA 0.61408  0.32928 0.034459 0.269770.98817 1 473 PD128 IMPACT DEL NA 0.03976  NA 0.24518 2.28E−04 NA 2 477PD129 IRF2 DEL 0.40337 3.01E−05 0.05021 0.589592 0.64778 2.74E−10 2 481PD130 ITFG1 DEL NA 0.001175 NA 0.301084 0.00633 NA 2 482 PD131 IWS1 DELNA 0.87708  NA 0.409111 1.95E−05 NA 1 484 PD132 KBTBD7 DEL NA 0.002473NA 0.454506 4.06E−04 NA 2 485 PD133 KCNN2 DEL NA 0.648566 NA 0.3097860.04231 NA 1 486 PD134 KCTD1 DEL NA 0.359142 NA 0.17875 0.02743 NA 1 487PD135 KCTD12 DEL 0.68739 NA 0.73943 0.791982 0.03583 0.00117 2 490 PD136KIAA0564 DEL 0.01864 0.208034 0.10815 0.019385 0.12997 0.02986 3 492PD137 KIAA1462 DEL NA 0.00392  0.99365 0.4981 5.94E−04 0.00402 3 493PD138 KIAA1632 DEL NA 0.016507 NA 0.298085 0.59906 NA 1 494 PD139KIAA1704 DEL NA 0.374666 NA 0.297732 0.00521 NA 1 496 PD140 KIF5B DEL0.62434 0.005821 0.75259 0.139132 0.01765 1.01E−10 3 498 PD141 KLF5 DELNA 0.550446 0.17449 0.004124 0.70257 4.86E−06 2 502 PD142 LAMA3 DEL0.99605 0.895323 0.5649  0.333792 0.04813 0.48107 1 503 PD143 LARGE DEL 0.000108 NA 0.06287 0.012723 0.21091 0.10481 2 505 PD144 LECT1 DEL NANA 0.44164 0.857727 0.17327 0.0063  1 506 PD145 LIMS2 DEL NA NA NA0.016367 1.20E−05 NA 2 509 PD146 LMO7 DEL NA 3.51E−05 0.31931 0.0442790.1613  0.51242 2 514 PD147 LRCH1 DEL NA 9.07E−05 0.06598 0.5081124.13E−05 0.65919 2 515 PD148 LRP1B DEL NA NA NA 0.042386 0.03827 NA 2520 PD149 MALT1 DEL 0.7125  0.375954 0.21517 0.222936 1.80E−04 0.0033  2521 PD150 MAP3K2 DEL NA 0.200777 NA 0.458006 0.00415 NA 1 522 PD151MAP3K8 DEL NA 0.391258 0.09963 0.575423 3.48E−04 0.71054 1 525 PD152MAPRE2 DEL 0.11916 0.115541 0.35263 0.570368 0.00673 0.01713 2 526 PD153MBD1 DEL 0.20051 0.854254 0.19116 0.539335 0.03601 2.20E−06 2 527 
PD154MBD2 DEL 0.41398 0.943347 0.58623 0.491569 0.01613 0.99383 1 530 PD155MCC DEL 0.04802 0.02228  0.01823 0.465124 0.00118 2.39E−06 5 533 PD156MED4 DEL NA 0.04561  NA 0.175621 0.00275 NA 2 534 PD157 MEP1B DEL NA NA0.06071 0.13458 0.24143 0.0013  1 536 PD158 MIB1 DEL NA 0.031467 NA0.212107 0.00245 NA 2 542 PD159 MKX DEL NA 3.00E−05 NA 2.23E−07 0.00325NA 3 547 PD160 MSR1 DEL NA NA 0.39676 0.608892 0.03636 0.76899 1 552PD161 MTUS1 DEL 0.0118  0.001754 0.13901 0.373327 0.00867 1.93E−04 4 553PD162 MYCBP2 DEL 0.03559 0.826307 0.08272 0.264376 0.02378 0.95813 2 554PD163 MYLK3 DEL NA NA NA 0.001233 0.05612 NA 1 555 PD164 MYO5B DEL NA0.001229 NA 0.019524 0.39535 NA 2 563 PD165 NDFIP2 DEL NA 0.010929 NA0.366434 0.01484 NA 2 570 PD166 NOL4 DEL NA NA 5.27E−04 0.00356 0.998150.92422 2 580 PD167 OSBPL1A DEL NA 0.115354 0.3728  0.598388 0.549761.08E−05 1 582 PD168 PARD3 DEL 0.13101 0.036693 0.46334 0.034202 0.092060.05407 2 584 PD169 PCDH20 DEL NA NA NA 0.011108 0.3747  NA 1 585 PD170PCDH8 DEL 0.00374 NA 0.06886 0.006767 0.18985 1.86E−06 3 586 PD171 PCDH9DEL NA NA 0.1196  0.009807 0.00246 0.84164 2 587 PD172 PCM1 DEL 0.126570.088236 0.42525 0.305345 0.02894 0.09336 1 589 PD173 PDLIM3 DEL  0.00000035 0.011612 1.14E−04 0.005655 1.30E−07 6.04E−08 6 592 PD174PHKB DEL 0.60874 0.005086 0.61018 0.110771 0.00421 0.01393 3 594 PD175PIBF1 DEL 0.02274 0.021214 0.00177 0.643912 0.00194 1.40E−08 5 595 PD176PIK3C3 DEL NA 4.14E−06 0.01703 0.015241 0.00755 5.04E−08 5 599 PD177POU4F1 DEL 0.02705 NA 0.02277 0.015565 0.70829 7.97E−13 4 600 PD178 PPICDEL 0.55976 0.436404 0.4587  0.546142 0.01208 0.52358 1 603 PD179 PRR16DEL NA NA NA 0.014057 0.20716 NA 1 604 PD180 PSD3 DEL 0.75813 0.0431680.71882 0.037462 0.53236 0.03524 3 612 PD181 RBL2 DEL 0.01891 NA 0.006770.577087 7.34E−04 1.22E−07 4 614 PD182 RBM34 DEL 0.95278 0.8033860.71189 0.340654 0.00673 0.00464 2 616 PD183 RIOK3 DEL 0.38886 0.0031890.38143 0.047268 7.85E−04 0.1299  3 617 PD184 RIT2 DEL NA NA 0.8599 0.723954 0.62373 0.00516 1 622 PD185 
ROCK1 DEL 0.11675 0.096195 0.4045 0.613222 0.53805 1.61E−08 1 625 PD186 RPRD1A DEL NA 0.043234 NA 0.1602930.33259 NA 1 628 PD187 SALL1 DEL NA NA 0.24073 0.783036 0.99063 0.012421 629 PD188 SAP130 DEL NA 0.799904 NA 0.639156 0.01606 NA 1 632 PD189SCEL DEL NA 0.667132 0.28102 0.370258 0.30819 7.65E−05 1 637 PD190SETBP1 DEL 0.00384 NA 0.00239 0.06506 0.11477 0.00103 3 642 PD191 SIAH1DEL 0.00407 7.00E−04 0.72178 0.253773 0.2575  7.26E−07 3 648 PD192SLC14A1 DEL NA 0.074271 0.01664 6.22E−04 3.63E−10 2.67E−04 4 651 PD193SLC25A30 DEL NA NA NA 0.458145 0.01013 NA 1 652 PD194 SLC25A4 DEL0.13327 0.005203 0.28397 0.215689 0.00734 0.00323 3 654 PD195 SLC35F3DEL NA NA NA 0.37405 0.01648 NA 1 655 PD196 SLC39A6 DEL 0.23761 0.0151130.06698 0.152996 0.05085 0.01062 2 656 PD197 SLC7A2 DEL NA 0.0728730.07386 0.031598 0.0335  0.99975 2 658 PD198 SLITRK6 DEL NA NA NA1.95E−04 0.92441 NA 1 659 PD199 SMAD2 DEL   0.00000231 0.014063 0.0605 0.14373 0.09569 5.85E−04 3 660 PD200 SMAD4 DEL 0.00293 0.00911  0.165430.145061 0.00271 1.17E−11 4 661 PD201 SMAD7 DEL 0.00216 0.762807 0.193130.173849 0.19091 9.94E−06 2 662 PD202 SNCAIP DEL NA 0.460247 NA 0.0279340.73285 NA 1 666 PD203 SNX2 DEL 0.17666 0.001511 0.89104 0.3516310.00555 0.1049  2 669 PD204 SNX25 DEL NA 0.136012 NA 0.177924 0.00347 NA1 670 PD205 SOCS6 DEL NA 0.034125 0.8803  0.645836 0.03332 0.61859 2 671PD206 SORBS2 DEL 0.00028 0.006641 0.00708 0.044607 3.61E−04 1.96E−08 6675 PD207 SPG11 DEL 0.45483 NA 0.22203 0.289656 0.05715 1.44E−06 1 676PD208 SPRY2 DEL 0.1516  0.001152 0.09289 0.082545 0.00122 0.09065 2 680PD209 ST8SIA3 DEL NA NA 0.09362 0.312924 0.41551 0.0082  1 681 PD210ST8SIA5 DEL NA NA 0.33435 0.653832 0.97966 3.72E−04 1 682 PD211 STARD4DEL NA 0.016624 NA 0.666443 0.06161 NA 1 685 PD212 SUCLA2 DEL 0.2074 1.24E−04 0.92863 0.033839 2.87E−04 0.98193 3 687 PD213 SVIL DEL 0.0000503 5.84E−04 0.0016  0.0735 9.62E−06 1.04E−10 5 689 PD214 TAF4BDEL NA 0.671957 0.65177 0.680769 0.0941  0.00914 1 690 PD215 TARBP1 DEL0.51313 0.728908 0.43663 
0.305718 4.24E−04 2.69E−16 2 691 PD216 TBC1D4DEL 0.66718 0.069162 0.29313 0.398181 0.28044 2.02E−05 1 692 PD217 TCF4DEL 0.10138 0.008672 0.30894 0.458196 0.01253 0.25621 2 695 PD218 TLR3DEL 0.80762 NA 0.56764 0.036034 0.00128 1.99E−04 3 711 PD219 TUSC3 DEL0.33271 5.51E−04 0.15414 0.092742 0.10787 3.83E−08 2 718 PD220 WAC DELNA 0.22034  NA 0.565345 0.01428 NA 1 719 PD221 WBP4 DEL  0.00000185.94E−05 0.01444 0.130772 3.02E−05 3.58E−04 5 722 PD222 WDR36 DEL NA0.179801 NA 0.45116 0.00871 NA 1 723 PD223 WDR7 DEL 0.10795 0.0193510.02239 0.659934 0.9946  3.94E−10 3 724 PD224 WWC2 DEL NA 0.19701  NA0.016602 0.00216 NA 2 725 PD225 YTHDC2 DEL 0.0525  0.101509 0.510290.306786 8.38E−04 0.01205 2 730 PD226 ZEB1 DEL NA 0.18408  NA 0.1659950.00284 NA 1 735 PD227 ZNF423 DEL NA 5.20E−04 0.05894 0.067857 0.034634.62E−07 3 740 PD228 ZNF608 DEL NA 0.007611 NA 0.163085 0.00153 NA 2
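In the rows above, the final entry of each row appears to equal the number of non-NA P values in that row that fall below 0.05; for example, TAF2 has four such values and a final entry of 4. A minimal sketch of that tally follows. The 0.05 cutoff, the treatment of NA as missing, and the function name are assumptions made for illustration and are not stated in this part of the table.

```python
def count_significant(cells, alpha=0.05):
    """Count P values below alpha, skipping cells marked NA (assumed missing)."""
    count = 0
    for cell in cells:
        if cell == "NA":
            continue
        # The table uses a Unicode minus sign in exponents (e.g. 3.40E−05).
        value = float(cell.replace("\u2212", "-"))
        if value < alpha:
            count += 1
    return count


# P-value cells copied from three rows of the table above.
TAF2 = ["0.02777", "3.40E\u221205", "0.00317", "0.153746", "0.00342", "0.08931"]
TGS1 = ["NA", "0.015925", "NA", "0.117524", "0.02268", "NA"]
PDLIM3 = ["0.00000035", "0.011612", "1.14E\u221204", "0.005655",
          "1.30E\u221207", "6.04E\u221208"]

TAF2_COUNT = count_significant(TAF2)
TGS1_COUNT = count_significant(TGS1)
PDLIM3_COUNT = count_significant(PDLIM3)
```

Under these assumptions the three tallies agree with the final entries listed for TAF2, TGS1 and PDLIM3 (4, 2 and 6).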

TABLE 4 Pathway enrichment analysis of the 228 PCA progression determinant genes (77 amplified and 151 deleted).

Name | Description | P | FDR | nSet_Gene | nHit_Gene | Hit_Genes
HSA01430_CELL_COMMUNICATION | Genes involved in cell communication | 0.00183 | 0.0311 | 97 | 6 | DSC2/DSC3/DSG1/DSG2/DSG3/LAMA3
HSA04350_TGF_BETA_SIGNALING_PATHWAY | Genes involved in TGF-beta signaling pathway | 0.00599 | 0.0509 | 87 | 5 | RBL2/ROCK1/SMAD2/SMAD4/SMAD7
HSA04520_ADHERENS_JUNCTION | Genes involved in adherens junction | 0.0168 | 0.0696 | 74 | 4 | LMO7/PARD3/SMAD2/SMAD4
HSA00280_VALINE_LEUCINE_AND_ISOLEUCINE_DEGRADATION | Genes involved in valine, leucine and isoleucine degradation | 0.017 | 0.0696 | 41 | 3 | ACAA2/ALDH7A1/HSD17B4
HSA00071_FATTY_ACID_METABOLISM | Genes involved in fatty acid metabolism | 0.0205 | 0.0696 | 44 | 3 | ACAA2/ALDH7A1/HSD17B4
HSA04310_WNT_SIGNALING_PATHWAY | Genes involved in Wnt signaling pathway | 0.0419 | 0.115 | 143 | 5 | APC/ROCK1/SIAH1/SMAD2/SMAD4
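One common way to obtain an enrichment P value from the quantities listed in Table 4 (pathway size nSet_Gene, overlap nHit_Gene, and the size of the query gene list) is a one-sided hypergeometric test. The source does not state which test was used, so the sketch below is an illustration only. The function name and the background universe of 20,000 genes are hypothetical, and the result is not expected to reproduce the tabulated P values.

```python
from math import comb


def enrichment_p(universe, set_size, query_size, hits):
    """One-sided hypergeometric tail: P(overlap >= hits).

    universe   -- number of background genes (assumed, not given in the source)
    set_size   -- genes in the pathway (nSet_Gene)
    query_size -- genes in the query list (e.g. 228 determinants)
    hits       -- observed overlap (nHit_Gene)
    """
    upper = min(set_size, query_size)
    tail = sum(
        comb(set_size, i) * comb(universe - set_size, query_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / comb(universe, query_size)


# Illustrative call for the cell communication row: 97 pathway genes,
# 6 hits among 228 query genes, with a hypothetical 20,000-gene universe.
EXAMPLE_P = enrichment_p(20000, 97, 228, 6)
```

The P value depends strongly on the chosen universe, which is why a background size must be fixed before values from different analyses are compared.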

TABLE 5 37 amplified and 76 deleted genes that were specifically altered in 14 human bone metastatic tumors in the Taylor et al. data set (2010). AMP, gene amplification; DEL, gene deletion; BM, bone metastasis.

Determinant No. | Name | AMP or DEL | Enriched in BM
308 | AKAP11 | DEL | BM
9 | AKAP9 | AMP | BM
312 | AMMECR1L | DEL | BM
12 | ANKIB1 | AMP | BM
18 | ARMC1 | AMP | BM
319 | ARMC4 | DEL | BM
19 | ASAP1 | AMP | BM
322 | ASXL3 | DEL | BM
324 | ATP5A1 | DEL | BM
22 | ATP6V1C1 | AMP | BM
326 | ATP8B1 | DEL | BM
23 | AZIN1 | AMP | BM
328 | BAMBI | DEL | BM
329 | BIN1 | DEL | BM
343 | C18ORF34 | DEL | BM
41 | CHCHD7 | AMP | BM
371 | CHD9 | DEL | BM
378 | COG3 | DEL | BM
386 | CUL2 | DEL | BM
56 | CYC1 | AMP | BM
389 | CYLD | DEL | BM
391 | DACH1 | DEL | BM
392 | DCC | DEL | BM
70 | DMTF1 | AMP | BM
400 | DNAJC15 | DEL | BM
72 | DPYS | AMP | BM
75 | EEF1D | AMP | BM
76 | EFR3A | AMP | BM
77 | EIF2C2 | AMP | BM
416 | ELF1 | DEL | BM
422 | EPC1 | DEL | BM
424 | ERCC3 | DEL | BM
87 | FAM49B | AMP | BM
440 | FHOD3 | DEL | BM
445 | GALNT1 | DEL | BM
102 | GATAD1 | AMP | BM
114 | GRINA | AMP | BM
466 | HMGXB4 | DEL | BM
471 | HTR2A | DEL | BM
481 | ITFG1 | DEL | BM
482 | IWS1 | DEL | BM
484 | KBTBD7 | DEL | BM
134 | KIAA0196 | AMP | BM
490 | KIAA0564 | DEL | BM
492 | KIAA1462 | DEL | BM
493 | KIAA1632 | DEL | BM
494 | KIAA1704 | DEL | BM
496 | KIF5B | DEL | BM
498 | KLF5 | DEL | BM
139 | KRIT1 | AMP | BM
503 | LARGE | DEL | BM
505 | LECT1 | DEL | BM
506 | LIMS2 | DEL | BM
514 | LRCH1 | DEL | BM
521 | MAP3K2 | DEL | BM
522 | MAP3K8 | DEL | BM
525 | MAPRE2 | DEL | BM
526 | MBD1 | DEL | BM
527 | MBD2 | DEL | BM
533 | MED4 | DEL | BM
542 | MKX | DEL | BM
171 | MTERF | AMP | BM
172 | MTFR1 | AMP | BM
554 | MYLK3 | DEL | BM
555 | MYO5B | DEL | BM
570 | NOL4 | DEL | BM
185 | NSMCE2 | AMP | BM
582 | PARD3 | DEL | BM
584 | PCDH20 | DEL | BM
585 | PCDH8 | DEL | BM
586 | PCDH9 | DEL | BM
196 | PDE7A | AMP | BM
199 | PHF20L1 | AMP | BM
592 | PHKB | DEL | BM
594 | PIBF1 | DEL | BM
595 | PIK3C3 | DEL | BM
207 | PTK2 | AMP | BM
209 | PUF60 | AMP | BM
612 | RBL2 | DEL | BM
617 | RIT2 | DEL | BM
217 | RNF139 | AMP | BM
221 | RPS20 | AMP | BM
628 | SALL1 | DEL | BM
629 | SAP130 | DEL | BM
637 | SETBP1 | DEL | BM
642 | SIAH1 | DEL | BM
648 | SLC14A1 | DEL | BM
651 | SLC25A30 | DEL | BM
658 | SLITRK6 | DEL | BM
659 | SMAD2 | DEL | BM
660 | SMAD4 | DEL | BM
661 | SMAD7 | DEL | BM
675 | SPG11 | DEL | BM
247 | SQLE | AMP | BM
248 | SRI | AMP | BM
680 | ST8SIA3 | DEL | BM
681 | ST8SIA5 | DEL | BM
685 | SUCLA2 | DEL | BM
687 | SVIL | DEL | BM
692 | TCF4 | DEL | BM
260 | TGS1 | AMP | BM
263 | TMEM65 | AMP | BM
264 | TMEM68 | AMP | BM
268 | TOP1MT | AMP | BM
269 | TRAPPC9 | AMP | BM
279 | UBR5 | AMP | BM
718 | WAC | DEL | BM
719 | WBP4 | DEL | BM
723 | WDR7 | DEL | BM
730 | ZEB1 | DEL | BM
735 | ZNF423 | DEL | BM
296 | ZNF623 | AMP | BM
298 | ZNF706 | AMP | BM

TABLE 6 Pathway enrichment analysis of the 113-gene bone metastasis set. P values were adjusted by false discovery rate (FDR).

Name | Description | P | FDR | nSet_Gene | nHit_Gene | Hit_Genes
REACTOME_SIGNALING_BY_TGF_BETA | http://www.broadinstitute.org/gsea/msigdb/cards/REACTOME_SIGNALING_BY_TGF_BETA.html | 0.000106 | 0.0015 | 15 | 3 | SMAD2/SMAD4/SMAD7
BIOCARTA_TGFB_PATHWAY | http://www.broadinstitute.org/gsea/msigdb/cards/BIOCARTA_TGFB_PATHWAY.html | 0.000187 | 0.0015 | 18 | 3 | SMAD2/SMAD4/SMAD7
KEGG_TGF_BETA_SIGNALING_PATHWAY | http://www.broadinstitute.org/gsea/msigdb/cards/KEGG_TGF_BETA_SIGNALING_PATHWAY.html | 0.00188 | 0.0101 | 83 | 4 | RBL2/SMAD2/SMAD4/SMAD7
KEGG_COLORECTAL_CANCER | http://www.broadinstitute.org/gsea/msigdb/cards/KEGG_COLORECTAL_CANCER.html | 0.00679 | 0.0272 | 61 | 3 | DCC/SMAD2/SMAD4
KEGG_ADHERENS_JUNCTION | http://www.broadinstitute.org/gsea/msigdb/cards/KEGG_ADHERENS_JUNCTION.html | 0.0107 | 0.0343 | 72 | 3 | PARD3/SMAD2/SMAD4
REACTOME_RNA_POLYMERASE_I_III_AND_MITOCHONDRIAL_TRANSCRIPTION | http://www.broadinstitute.org/gsea/msigdb/cards/REACTOME_RNA_POLYMERASE_I_III_AND_MITOCHONDRIAL_TRANSCRIPTION.html | 0.0162 | 0.0432 | 84 | 3 | ERCC3/MBD2/MTERF
KEGG_CELL_CYCLE | http://www.broadinstitute.org/gsea/msigdb/cards/KEGG_CELL_CYCLE.html | 0.035 | 0.0748 | 113 | 3 | RBL2/SMAD2/SMAD4
KEGG_OXIDATIVE_PHOSPHORYLATION | http://www.broadinstitute.org/gsea/msigdb/cards/KEGG_OXIDATIVE_PHOSPHORYLATION.html | 0.0374 | 0.0748 | 116 | 3 | ATP5A1/ATP6V1C1/CYC1
KEGG_PATHWAYS_IN_CANCER | http://www.broadinstitute.org/gsea/msigdb/cards/KEGG_PATHWAYS_IN_CANCER.html | 0.0469 | 0.0834 | 310 | 5 | CUL2/DCC/PTK2/SMAD2/SMAD4
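Table 6 reports an FDR next to each P value but does not name the adjustment procedure. A widely used choice is the Benjamini-Hochberg step-up method, sketched below. The choice of that method and the total of 16 tested gene sets are assumptions: they are not stated in the source, and 16 is simply the value for which the sketch matches the listed FDR column to within rounding of the printed P values.

```python
def bh_adjust(pvalues, n_tests=None):
    """Benjamini-Hochberg adjusted P values, returned in input order.

    n_tests is the total number of hypotheses tested. If it exceeds
    len(pvalues), the supplied values are assumed to be the smallest ones.
    """
    m = len(pvalues) if n_tests is None else n_tests
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    adjusted = [0.0] * len(pvalues)
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(len(order), 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# P and FDR columns copied from Table 6.
TABLE6_P = [0.000106, 0.000187, 0.00188, 0.00679, 0.0107,
            0.0162, 0.035, 0.0374, 0.0469]
TABLE6_FDR = [0.0015, 0.0015, 0.0101, 0.0272, 0.0343,
              0.0432, 0.0748, 0.0748, 0.0834]

# Assumption: 16 gene sets were tested in total (not stated in the source).
ADJUSTED = bh_adjust(TABLE6_P, n_tests=16)
```

The running minimum is what gives the first two rows, and likewise the seventh and eighth, the same adjusted value even though their raw P values differ.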

TABLE 7 Two-Determinant Combinations

Columns: ATP5A1 ATP6V1C1 CUL2 CYC1 DCC ERCC3 MBD2 MTERF PARD3 PTK2 RBL2 SMAD2
ATP5A1 + + + + + + + + + + +
ATP6V1C1 + + + + + + + + + +
CUL2 + + + + + + + + +
CYC1 + + + + + + + +
DCC + + + + + + +
ERCC3 + + + + + +
MBD2 + + + + +
MTERF + + + +
PARD3 + + +
PTK2 + +
RBL2 +
SMAD2
SMAD4
SMAD7
DNAJC15
KIF5B
LECT1
DSG2
ACAA2
ASAP1
LMO7
SVIL
DSC2
PCDH9
WDR7
LAMA3
PCDH8
MKX
MSR1
POLR2K
PTEN
Cyclin D1

Columns: SMAD4 SMAD7 DNAJC15 KIF5B LECT1 DSG2 ACAA2 ASAP1 LMO7 SVIL DSC2
ATP5A1 + + + + + + + + + + +
ATP6V1C1 + + + + + + + + + + +
CUL2 + + + + + + + + + + +
CYC1 + + + + + + + + + + +
DCC + + + + + + + + + + +
ERCC3 + + + + + + + + + + +
MBD2 + + + + + + + + + + +
MTERF + + + + + + + + + + +
PARD3 + + + + + + + + + + +
PTK2 + + + + + + + + + + +
RBL2 + + + + + + + + + + +
SMAD2 + + + + + + + + + + +
SMAD4 + + + + + + + + + +
SMAD7 + + + + + + + + +
DNAJC15 + + + + + + + +
KIF5B + + + + + + +
LECT1 + + + + + +
DSG2 + + + + +
ACAA2 + + + +
ASAP1 + + +
LMO7 + +
SVIL +
DSC2
PCDH9
WDR7
LAMA3
PCDH8
MKX
MSR1
POLR2K
PTEN
Cyclin D1

Columns: PCDH9 WDR7 LAMA3 PCDH8 MKX MSR1 POLR2K PTEN Cyclin D1 SPP1
ATP5A1 + + + + + + + + + +
ATP6V1C1 + + + + + + + + + +
CUL2 + + + + + + + + + +
CYC1 + + + + + + + + + +
DCC + + + + + + + + + +
ERCC3 + + + + + + + + + +
MBD2 + + + + + + + + + +
MTERF + + + + + + + + + +
PARD3 + + + + + + + + + +
PTK2 + + + + + + + + + +
RBL2 + + + + + + + + + +
SMAD2 + + + + + + + + + +
SMAD4 + + + + + + + + + +
SMAD7 + + + + + + + + + +
DNAJC15 + + + + + + + + + +
KIF5B + + + + + + + + + +
LECT1 + + + + + + + + + +
DSG2 + + + + + + + + + +
ACAA2 + + + + + + + + + +
ASAP1 + + + + + + + + + +
LMO7 + + + + + + + + + +
SVIL + + + + + + + + + +
DSC2 + + + + + + + + + +
PCDH9 + + + + + + + + +
WDR7 + + + + + + + +
LAMA3 + + + + + + +
PCDH8 + + + + + +
MKX + + + + +
MSR1 + + + +
POLR2K + + +
PTEN + +
Cyclin D1 +
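Table 7 marks two-determinant combinations with a "+" in an upper-triangular layout. The table contains 528 plus signs in total, which equals the number of unordered pairs that can be formed from the 33 determinants named in its row and column labels. A minimal sketch that enumerates those pairs is shown below; the variable names are illustrative, and reading the table as "every distinct pair" is an interpretation of the layout rather than an explicit statement in the source.

```python
from itertools import combinations

# Determinant names in the order they appear in Table 7.
DETERMINANTS = [
    "ATP5A1", "ATP6V1C1", "CUL2", "CYC1", "DCC", "ERCC3", "MBD2", "MTERF",
    "PARD3", "PTK2", "RBL2", "SMAD2", "SMAD4", "SMAD7", "DNAJC15", "KIF5B",
    "LECT1", "DSG2", "ACAA2", "ASAP1", "LMO7", "SVIL", "DSC2", "PCDH9",
    "WDR7", "LAMA3", "PCDH8", "MKX", "MSR1", "POLR2K", "PTEN", "Cyclin D1",
    "SPP1",
]

# Each unordered pair appears once, matching the upper-triangular layout.
PAIRS = list(combinations(DETERMINANTS, 2))
PAIR_COUNT = len(PAIRS)
```

Because `combinations` preserves input order, the first element of each pair is the row determinant and the second is the column determinant in the table's ordering.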

REFERENCE LIST

1. Ding, Z. et al. SMAD4-dependent barrier constrains prostate cancer growth and metastatic progression. Nature 470, 269-273 (2011).
2. Taylor, B. S. et al. Integrative genomic profiling of human prostate cancer. Cancer Cell 18, 11-22 (2010).
3. Glinsky, G. V., Glinskii, A. B., Stephenson, A. J., Hoffman, R. M. & Gerald, W. L. Gene expression profiling predicts clinical outcome of prostate cancer. J. Clin. Invest. 113, 913-923 (2004).
4. Sharpless, N. E. & DePinho, R. A. The mighty mouse: genetically engineered mouse models in cancer drug development. Nat. Rev. Drug Discov. 5, 741-754 (2006).
5. Shen, M. M. & Abate-Shen, C. Molecular genetics of prostate cancer: new prospects for old challenges. Genes Dev. 24, 1967-2000 (2010).
6. Li, J. et al. PTEN, a putative protein tyrosine phosphatase gene mutated in human brain, breast, and prostate cancer. Science 275, 1943-1947 (1997).
7. Guo, Y., Sklar, G. N., Borkowski, A. & Kyprianou, N. Loss of the cyclin-dependent kinase inhibitor p27(Kip1) protein in human prostate cancer correlates with tumor grade. Clin. Cancer Res. 3, 2269-2274 (1997).
8. Majumder, P. K. et al. A prostatic intraepithelial neoplasia-dependent p27 Kip1 checkpoint induces senescence and inhibits cell proliferation and cancer progression. Cancer Cell 14, 146-155 (2008).
9. Tomlins, S. A. et al. Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. Science 310, 644-648 (2005).
10. Rubin, M. A. Targeted therapy of cancer: new roles for pathologists--prostate cancer. Mod. Pathol. 21 Suppl 2, S44-S55 (2008).
11. Abate-Shen, C., Shen, M. M. & Gelmann, E. Integrating differentiation and cancer: The Nkx3.1 homeobox gene in prostate organogenesis and carcinogenesis. Differentiation (2008).
12. Tomlins, S. A. et al. The role of SPINK1 in ETS rearrangement-negative prostate cancers. Cancer Cell 13, 519-528 (2008).
13. Jenkins, R. B., Qian, J., Lieber, M. M. & Bostwick, D. G. Detection of c-myc oncogene amplification and chromosomal anomalies in metastatic prostatic carcinoma by fluorescence in situ hybridization. Cancer Res. 57, 524-531 (1997).
14. Acevedo, V. D., Ittmann, M. & Spencer, D. M. Paths of FGFR-driven tumorigenesis. Cell Cycle 8, 580-588 (2009).
15. Rubin, M. A. et al. E-cadherin expression in prostate cancer: a broad survey using high-density tissue microarray technology. Hum. Pathol. 32, 690-697 (2001).
16. Chaib, H. et al. Activated in prostate cancer: a PDZ domain-containing protein highly expressed in human primary prostate tumors. Cancer Res. 61, 2390-2394 (2001).
17. Dhanasekaran, S. M. et al. Delineation of prognostic biomarkers in prostate cancer. Nature 412, 822-826 (2001).
18. Rubin, M. A. et al. alpha-Methylacyl coenzyme A racemase as a tissue biomarker for prostate cancer. JAMA 287, 1662-1670 (2002).
19. Varambally, S. et al. Genomic Loss of microRNA-101 Leads to Overexpression of Histone Methyltransferase EZH2 in Cancer. Science (2008).
20. Varambally, S. et al. The polycomb group protein EZH2 is involved in progression of prostate cancer. Nature 419, 624-629 (2002).
21. Min, J. et al. An oncogene-tumor suppressor cascade drives metastatic prostate cancer by coordinately activating Ras and nuclear factor-kappaB. Nat. Med. 16, 286-294 (2010).
22. Chen, Z. et al. Crucial role of p53-dependent cellular senescence in suppression of Pten-deficient tumorigenesis. Nature 436, 725-730 (2005).
23. Kim, M. et al. Comparative oncogenomics identifies NEDD9 as a melanoma metastasis gene. Cell 125, 1269-1281 (2006).
24. Zender, L. et al. Identification and validation of oncogenes in liver cancer using an integrative oncogenomic approach. Cell 125, 1253-1267 (2006).
25. Maser, R. S. et al. Chromosomally unstable mouse tumours have genomic alterations similar to diverse human cancers. Nature 447, 966-971 (2007).
26. DePinho, R. A. The age of cancer. Nature 408, 248-254 (2000).
27. Artandi, S. E. et al. Telomere dysfunction promotes non-reciprocal translocations and epithelial cancers in mice. Nature 406, 641-645 (2000).
28. Chin, L. et al. p53 deficiency rescues the adverse effects of telomere loss and cooperates with telomere dysfunction to accelerate carcinogenesis. Cell 97, 527-538 (1999).
29. O'Hagan, R. C. et al. Telomere dysfunction provokes regional amplification and deletion in cancer genomes. Cancer Cell 2, 149-155 (2002).
30. Rudolph, K. L., Millard, M., Bosenberg, M. W. & DePinho, R. A. Telomere dysfunction and evolution of intestinal carcinoma in mice and humans. Nat. Genet. 28, 155-159 (2001).
31. Chin, K. et al. In situ analyses of genome instability in breast cancer. Nat. Genet. 36, 984-988 (2004).
32. Feldmann, G., Beaty, R., Hruban, R. H. & Maitra, A. Molecular genetics of pancreatic intraepithelial neoplasia. J. Hepatobiliary Pancreat. Surg. 14, 224-232 (2007).
33. Meeker, A. K. et al. Telomere shortening is an early somatic DNA alteration in human prostate tumorigenesis. Cancer Res. 62, 6405-6409 (2002).
34. Stratton, M. R., Campbell, P. J. & Futreal, P. A. The cancer genome. Nature 458, 719-724 (2009).
35. Sommerfeld, H. J. et al. Telomerase activity: a prevalent marker of malignant human prostate tissue. Cancer Res. 56, 218-222 (1996).
36. Vukovic, B. et al. Evidence of multifocality of telomere erosion in high-grade prostatic intraepithelial neoplasia (HPIN) and concurrent carcinoma. Oncogene 22, 1978-1987 (2003).
37. Hahn, W. C. et al. Inhibition of telomerase limits the growth of human cancer cells. Nat. Med. 5, 1164-1170 (1999).
38. Chang, S., Khoo, C. M., Naylor, M. L., Maser, R. S. & DePinho, R. A. Telomere-based crisis: functional differences between telomerase activation and ALT in tumor progression. Genes Dev. 17, 88-100 (2003).
39. Kallakury, B. V. et al. Telomerase activity in human benign prostate tissue and prostatic adenocarcinomas. Diagn. Mol. Pathol. 6, 192-198 (1997).
40. Lin, Y. et al. Telomerase activity in primary prostate cancer. J. Urol. 157, 1161-1165 (1997).
41. Koeneman, K. S. et al. Telomerase activity, telomere length, and DNA ploidy in prostatic intraepithelial neoplasia (PIN). J. Urol. 160, 1533-1539 (1998).
42. Zhang, W., Kapusta, L. R., Slingerland, J. M. & Klotz, L. H. Telomerase activity in prostate cancer, prostatic intraepithelial neoplasia, and benign prostatic epithelium. Cancer Res. 58, 619-621 (1998).
43. Zheng, H. et al. p53 and Pten control neural and glioma stem/progenitor cell renewal and differentiation. Nature 455, 1129-1133 (2008).
44. Farazi, P. A., Glickman, J., Horner, J. & DePinho, R. A. Cooperative interactions of p53 mutation, telomere dysfunction, and chronic liver damage in hepatocellular carcinoma progression. Cancer Res. 66, 4766-4773 (2006).
45. Marino, S., Vooijs, M., van der, G. H., Jonkers, J. & Berns, A. Induction of medulloblastomas in p53-null mutant mice by somatic inactivation of Rb in the external granular layer cells of the cerebellum. Genes Dev. 14, 994-1004 (2000).
46. Wu, X. et al. Generation of a prostate epithelial cell-specific Cre transgenic mouse model for tissue-specific gene ablation. Mech. Dev. 101, 61-69 (2001).
47. Beroukhim, R. et al. Assessing the significance of chromosomal aberrations in cancer: methodology and application to glioma. Proc. Natl. Acad. Sci. U.S.A. 104, 20007-20012 (2007).
48. Holzbeierlein, J. et al. Gene expression analysis of human prostate carcinoma during hormonal therapy identifies androgen-responsive genes and mechanisms of therapy resistance. Am. J. Pathol. 164, 217-227 (2004).
49. Lapointe, J. et al. Gene expression profiling identifies clinically relevant subtypes of prostate cancer. Proc. Natl. Acad. Sci. U.S.A. 101, 811-816 (2004).
50. LaTulippe, E. et al. Comprehensive gene expression analysis of prostate cancer reveals distinct transcriptional programs associated with metastatic disease. Cancer Res. 62, 4499-4506 (2002).
51. Vanaja, D. K., Cheville, J. C., Iturria, S. J. & Young, C. Y. Transcriptional silencing of zinc finger protein 185 identified by expression profiling is associated with prostate cancer progression. Cancer Res. 63, 3877-3882 (2003).
52. Varambally, S. et al. Integrative genomic and proteomic analysis of prostate cancer reveals signatures of metastatic progression. Cancer Cell 8, 393-406 (2005).
53. Yu, Y. P. et al. Gene expression alterations in prostate cancer predicting tumor aggression and preceding development of malignancy. J. Clin. Oncol. 22, 2790-2799 (2004).
54. Lee, H. W. et al. Essential role of mouse telomerase in highly proliferative organs. Nature 392, 569-574 (1998).
55. Jonkers, J. et al. Synergistic tumor suppressor activity of BRCA2 and p53 in a conditional mouse model for breast cancer. Nat. Genet. 29, 418-425 (2001).
56. Gonzalez-Suarez, E., Samper, E., Flores, J. M. & Blasco, M. A. Telomerase-deficient mice with short telomeres are resistant to skin tumorigenesis. Nat. Genet. 26, 114-117 (2000).
57. Jaskelioff, M. et al. Telomerase deficiency and telomere dysfunction inhibit mammary tumors induced by polyomavirus middle T oncogene. Oncogene 28, 4225-4236 (2009).
58. Takai, H., Smogorzewska, A. & de Lange, T. DNA damage foci at dysfunctional telomeres. Curr. Biol. 13, 1549-1556 (2003).
59. IJpma, A. S. & Greider, C. W. Short telomeres induce a DNA damage response in Saccharomyces cerevisiae. Mol. Biol. Cell 14, 987-1001 (2003).
60. Forbes, S. A. et al. COSMIC (the Catalogue of Somatic Mutations in Cancer): a resource to investigate acquired mutations in human cancer. Nucleic Acids Res. 38, D652-D657 (2010).
61. Forbes, S. A. et al. COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res. 39, D945-D950 (2011).
62. Ongenaert, M. et al. PubMeth: a cancer methylation database combining text-mining and expert annotation. Nucleic Acids Res. 36, D842-D846 (2008).
63. Holzbeierlein, J. et al. Gene expression analysis of human prostate carcinoma during hormonal therapy identifies androgen-responsive genes and mechanisms of therapy resistance. Am. J. Pathol. 164, 217-227 (2004).
64. Aitchison, A. A. et al. Promoter methylation correlates with reduced Smad4 expression in advanced prostate cancer. Prostate 68, 661-674 (2008).

1-89. (canceled)
90. A method of detecting CUL2 protein in a cancer patient, said method comprising: obtaining a first tissue sample from the patient; and detecting whether CUL2 protein is present in the sample by contacting the tissue sample with an anti-CUL2 antibody or fragment thereof, and detecting binding between CUL2 protein and the anti-CUL2 antibody or fragment thereof.
91. The method of claim 90, which further comprises detecting DERL1 protein in the first tissue sample or a second tissue sample from the patient.
92. The method of claim 91, wherein detecting DERL1 protein in the first tissue sample or second tissue sample comprises detecting whether DERL1 protein is present in the sample by contacting the first tissue sample or second tissue sample with an anti-DERL1 antibody or fragment thereof, and detecting binding between DERL1 protein and the anti-DERL1 antibody or fragment thereof.
93. The method of claim 90, which further comprises detecting SMAD4 protein in the first tissue sample or a second tissue sample from the patient.
94. The method of claim 93, wherein detecting SMAD4 protein in the first tissue sample or second tissue sample comprises detecting whether SMAD4 protein is present in the sample by contacting the first tissue sample or second tissue sample with an anti-SMAD4 antibody or fragment thereof, and detecting binding between SMAD4 protein and the anti-SMAD4 antibody or fragment thereof.
95. The method of claim 90, wherein detecting CUL2 comprises performing immunofluorescence.
96. The method of claim 90, wherein the first tissue sample comprises prostate cancer tissue.
97. The method of claim 91, wherein the second tissue sample comprises prostate cancer tissue.
98. A method of detecting DERL1 protein in a cancer patient, said method comprising: obtaining a first tissue sample from the patient; and detecting whether DERL1 protein is present in the sample by contacting the tissue sample with an anti-DERL1 antibody or fragment thereof, and detecting binding between DERL1 protein and the anti-DERL1 antibody or fragment thereof.
99. The method of claim 98, wherein the first tissue sample comprises prostate cancer tissue.
100. A reaction mixture comprising a prostate tissue sample and: a CUL2 probe or a CUL2 antibody or fragment thereof, or a DERL1 probe or a DERL1 antibody or fragment thereof.
101. A reaction mixture comprising a prostate tissue sample and a DERL1 antibody or fragment thereof bound to a detectable label.